Int. J. Mol. Sci. 2011, 12(10), 7250-7264; doi:10.3390/ijms12107250

Article
Estimating the Octanol/Water Partition Coefficient for Aliphatic Organic Compounds Using Semi-Empirical Electrotopological Index
Erica Silva Souza 1, Laize Zaramello 2, Carlos Alberto Kuhnen 2, Berenice da Silva Junkes 3, Rosendo Augusto Yunes 1 and Vilma Edite Fonseca Heinzen 1,*
1 Departamento de Química, Universidade Federal de Santa Catarina, Campus Universitário, Trindade, Florianópolis, SC 88040-970, Brazil
2 Departamento de Física, Universidade Federal de Santa Catarina, Campus Universitário, Trindade, Florianópolis, SC 88040-970, Brazil
3 Instituto Federal de Educação, Ciência e Tecnologia de Santa Catarina, Avenida Mauro Ramos 950, Florianópolis, SC 88020-300, Brazil
* Author to whom correspondence should be addressed; Tel.: +55-48-3721-6849; Fax: +55-48-3721-6850.
Received: 8 September 2011; in revised form: 8 October 2011 / Accepted: 14 October 2011 / Published: 24 October 2011

Abstract: A new possibility for estimating the octanol/water partition coefficient (log P) was investigated using only one descriptor, the semi-empirical electrotopological index (ISET). The predictability of four log P calculation models was compared using a set of 131 aliphatic organic compounds from five different classes. Log P values were calculated employing atomic-contribution methods, as in the Ghose/Crippen approach and its later refinement, AlogP; using fragmental methods through the ClogP method; and employing an approach considering the whole molecule using topological indices with the MlogP method. The efficiency and applicability of the ISET for calculating log P were demonstrated through good statistical quality (r > 0.99; s < 0.18), high internal stability and good predictive ability for an external group of compounds. These are of the same order as those of the fragmental method, ClogP, and the atomic-contribution method, AlogP, which are among the most widely used methods of predicting log P.

Keywords: quantitative structure-property relationship; n-octanol-water partition coefficient; semi-empirical electrotopological index

1. Introduction

The logarithm of the 1-octanol/water partition coefficient (log P), which is a measure of hydrophobicity, is widely used in Quantitative Structure-Activity Relationship (QSAR) models for predicting the pharmaceutical properties of molecules [1–7]. In medicinal chemistry there is continued interest in developing methods of deriving log P from molecular structure. From the experimental point of view, equilibrium methods for determining partition coefficients are difficult or, in some cases, impossible to apply, for example with unstable compounds or in the presence of impurities. Other difficulties are associated with the formation of stable emulsions after shaking, or with compounds that have a strong preference for one of the phases of the system. Thus, the agreement between the theoretical and experimental approaches to the determination of partition coefficients continues to be a focus of scientific interest [8]. Despite the huge amount of experimental log P data for organic structures, it is still insufficient compared with the number of compounds for which log P is of interest [5]. The first method of calculating log P was the π-system, developed by Hansch and Fujita [9,10]. Several methods for calculating log P from chemical structure have in common that molecules are cut into groups or atoms; summing the fragmental or single-atom contributions gives the final log P value.

The most widely used method for calculating log P is the fragmental method [11], which is based on the additive-constitutive nature of log P. In the atomic-contribution method [12] the atom type is used instead of a fragment. This approach was developed in an effort to attribute properties to an atom within a molecular structure, and most of these methods do not use correction factors, unlike the fragmental methods. The more recent approaches consider the molecule as a whole. These models attempt to make theoretical estimations of log P using graph-theoretical descriptors, molecular properties or quantum-chemical descriptors, with some methods incorporating the effects of the three-dimensional structure and the electronic properties of the molecule [13–22]. Several researchers have compared the predictive ability of log P calculation models. A review comparing log P calculations obtained from different models was published by Mannhold and van de Waterbeemd in 2001 [5].

Recently, a new topological index, called the semi-empirical electrotopological index (ISET), was developed by our research group in order to obtain a molecular descriptor that is not directly related to the chromatographic retention indices (RI) but is based on values calculated by quantum mechanics, for use in Quantitative Structure-Property Relationship (QSPR) models for different classes of organic compounds. This approach takes into account the charges of the heteroatom and of the carbon atoms attached to it through the definition of an equivalent local dipole moment [23–26].

The main goal of this study is to compare the predictive power of four log P calculation models and ISET for a set of 131 aliphatic organic compounds from five different classes. The models are validated using the cross-validation coefficient, rcv², and by calculating log P for seven aliphatic alcohols with known experimental values that were not included in the training sets of each model.

2. Methods

The QSPR study of these aliphatic organic compounds comprised the selection of the data set, generation of molecular descriptors, simple linear regression analysis and model validation. The model applicability was further examined by plotting predicted data against experimental data for all of the compounds. All regression analyses were carried out using the Origin [27] and TSAR [28] programs. The statistical parameters used to test the prediction efficiency of the models were the correlation coefficient (r), standard deviation (s), coefficient of determination (r²) and null hypothesis test (F-test). The validity of the model was tested with the cross-validation coefficient (rcv²) using the "leave-one-out" procedure in TSAR 3.3 for Windows [28]. A group of seven compounds, not included in the original QSPR models, was employed for the external validation.
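The statistical parameters listed above can be illustrated with a minimal sketch in plain Python. It fits log P against ISET for the first seven n-alkanes of Table 1 and computes r, s, r², F and a leave-one-out rcv². The function names are hypothetical, and the conventions used (n − 2 degrees of freedom for s, PRESS referred to the full-set mean for rcv²) are assumptions; the authors used Origin and TSAR, whose exact conventions are not stated here, so the numbers are illustrative and are not meant to reproduce Table 2.

```python
import math

# n-Alkanes ethane..n-octane: (I_SET, experimental log P), copied from Table 1.
data = [(1.9981, 1.81), (2.8148, 2.36), (3.6343, 2.89), (4.4457, 3.39),
        (5.2622, 4.00), (6.0787, 4.50), (6.8952, 5.15)]


def fit_line(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    b = sxy / sxx
    return my - b * mx, b


def regression_stats(points):
    """r, s (assumed n - 2 degrees of freedom), r^2 and F for a one-descriptor fit."""
    n = len(points)
    a, b = fit_line(points)
    my = sum(y for _, y in points) / n
    ss_tot = sum((y - my) ** 2 for _, y in points)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in points)
    r2 = 1.0 - ss_res / ss_tot
    s = math.sqrt(ss_res / (n - 2))
    f = (ss_tot - ss_res) / (ss_res / (n - 2))
    r = math.copysign(math.sqrt(r2), b)
    return {"r": r, "s": s, "r2": r2, "F": f}


def loo_q2(points):
    """Leave-one-out cross-validated r^2, taken here as 1 - PRESS / SS_tot."""
    n = len(points)
    my = sum(y for _, y in points) / n
    ss_tot = sum((y - my) ** 2 for _, y in points)
    press = 0.0
    for i, (x, y) in enumerate(points):
        # Refit without compound i, then predict it.
        a, b = fit_line(points[:i] + points[i + 1:])
        press += (y - (a + b * x)) ** 2
    return 1.0 - press / ss_tot


stats = regression_stats(data)
q2 = loo_q2(data)
print({k: round(v, 4) for k, v in stats.items()}, "rcv2 =", round(q2, 4))
```

Because each compound is predicted from a model fitted without it, rcv² is never larger than r² for the same data, which is why it serves as a check on internal stability.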

2.1. Data Set and Calculation Models

The experimental log P values for the organic compound groups studied herein were taken from the literature [6,7]. Theoretical values of log P for 131 aliphatic organic compounds were obtained using four log P calculation models. Log P calculation methods can be roughly divided into two major classes: substructure approaches, which have in common that molecules are cut into groups (fragmental methods) or atoms (atomic-contribution methods); and whole-molecule approaches (property-based models), which consider the entire molecule using molecular lipophilicity potentials, topological indices or molecular properties. Atomic-contribution methods do not usually require correction factors. The almost identical methodological background of the fragmental and atomic-contribution methods indicates their interchangeability.

Log P values were calculated employing atomic-contribution methods as in the Ghose/Crippen approach [12] (available in the Hyperchem package [29]) or its later refinement, AlogP [30,31], and using fragmental methods such as the ClogP method [32] available in the Osiris Property Explorer package [33]. ClogP and AlogP methods are among the most prominent methods of predicting log P. Both methods have been implemented as part of free and commercial software programs for molecular modeling applications [29,33,34]. Values of log P derived from the whole-molecule approach were calculated using topological indices as in the MlogP method [35]. AlogP and MlogP are available in the VCCLAB on-line software package (ALOGPS 2.1 program) [34]. The calculated and the experimental log P values for 131 organic compounds in the test set are shown in Table 1. The theoretical values were then determined using the models of Ghose/Crippen, AlogP, ClogP, MlogP and the present model through the ISET molecular descriptor. As can be seen in Table 1, some experimental log P values are missing, which may be related to the inherent difficulties associated with the determination of log P for certain compounds. However, their calculated values are included herein to allow future comparison with experimental values.

2.2. Semi-Empirical Electrotopological Index, ISET

In this study, the recently developed semi-empirical electrotopological index, ISET [23–26], is applied in QSPR studies to predict the octanol/water partition coefficient, log P, for a large number of organic compounds, including aliphatic hydrocarbons (alkanes and alkenes), aldehydes, ketones, esters and alcohols. This descriptor can be quickly calculated for this series of molecules from the semi-empirical, quantum-chemical AM1 method and correlated with the approximate numerical values attributed by the semi-empirical topological index to the primary, secondary, tertiary and quaternary carbon atoms. Thus, unifying the quantum-chemical with the topological method gives a three-dimensional picture of the atoms in the molecule [23]. It is important to note that the AM1 method gives more reliable semi-empirical charges, dipoles and bond lengths than those obtained from time-consuming, low-quality ab initio methods, that is, when employing a minimal basis set in ab initio calculations [36]. Although the calculated partial atomic charges may be less reliable than other molecular properties, and different semi-empirical methods give values for the net charges with poor numerical agreement, their calculation is easy and the values at least indicate trends in the charge density distributions in the molecules. Since many chemical reactions and physico-chemical properties are strongly dependent on local electron densities, net atomic charges and other charge-based descriptors are currently used as chemical reactivity indices [37].

For alkanes and alkenes, this correlation has allowed the creation of a new semi-empirical electrotopological index (ISET) for QSRR models [20], based on the fact that the interactions between the solute and the stationary phase are due to electrostatic and dispersive forces. The index is able to distinguish between cis- and trans-isomers directly from the net atomic charges of the carbon atoms obtained from quantum-chemical calculations. For polar molecules such as aldehydes, ketones, esters and alcohols, the presence of heteroatoms such as oxygen considerably changes the charge distribution relative to the corresponding hydrocarbons, giving a partial increase in the interactions between the solute and the stationary phase. An appropriate way to calculate the ISET was therefore developed, which takes into account the dipole moment of these molecules and the atomic charges of the heteroatoms and the carbon atoms attached to them. Considering the stationary phase as a non-polar material, the interactions between these molecules and the stationary phase are electrostatic with a contribution from dispersive forces, and they increase slightly relative to the corresponding hydrocarbons. This increase is due to the charge redistribution that occurs in the presence of the heteroatom, which also accounts for the dipole moment of the molecules. The dispersive forces between these molecules and the stationary phase include charge-dipole and dipole-induced dipole interactions, which are weak relative to the electrostatic interactions. Thus, the dipolar charge distribution in such molecules leads to a small increase in the interactions of the solute with the stationary phase relative to hydrocarbons, where the dipole moment is zero or almost zero. The major effects on the charge distribution due to the (oxygen) heteroatoms occur in their neighborhood, and the excess charge at these atoms leads to electrostatic interactions that are stronger than the weak dispersive dipolar interactions.

For aldehydes, ketones, esters and alcohols all these factors were included in the calculation of the retention index through a small increase in the values of the atomic descriptor (named SETi) for the heteroatoms and the carbon atoms attached to them [24–26]. This was achieved by multiplying the SETi values of these atoms by a function Aμ, which is logarithmically dependent on the dipole moment of the molecule and on the net charges at the oxygen and carbon atoms (to include both the electrostatic and dispersive interactions) that are embodied in the definition of the local dipole moment μF [24–26]. In this approach the dispersive dipolar interactions were included in the calculation of the retention index by multiplying the SETi values of the heteroatoms (oxygen) and of the carbon atoms attached to them by the dipolar function Aμ. That is, in this model the ISET is calculated as in Equation 1,

$$I_{SET} = \sum_{i} I_{SET_i} = \sum_{i,j} \left( A_\mu SET_i + \log (A_\mu SET_j) \right) \qquad (1)$$

where the SETi values are obtained through a linear relationship with the net atomic charge obtained from AM1 calculations [18–21]. In Equation 1, Aμ is logarithmically dependent on the dipole moment of the molecule, as in Equation 2:

$$A_\mu = 1 + \log \left( 1 + \frac{\mu}{\mu_F} \right) \qquad (2)$$

where μ is the calculated molecular dipole moment and μF is the equivalent local dipole moment, which depends on the charges of the atoms belonging to the C-heteroatom group. In the above expression for the ISET (Equation 1) the dipolar function Aμ is taken as unity for the remaining carbon atoms of the molecule. The various definitions of the local dipole moment μF are given in previous papers concerned with the retention index of aldehydes, ketones, esters and alcohols [24–26].

For the ISET model, the AM1 semi-empirical calculations of the net atomic charges were performed using the Hyperchem software package [29]. The initial geometries were obtained through molecular mechanics (MM+) calculations and subsequently optimized using the AM1 method [36,38], employing the Polak-Ribiere algorithm and gradient minimization techniques with a convergence limit of 0.0001 and an RMS gradient of 0.0001 kcal (Å mol)−1. Mulliken population analysis was employed to obtain the net atomic charges of the carbon and oxygen atoms. The net atomic charge (Qi) is obtained from the difference between the electronic charge of the isolated atom (Z) and the calculated charge of the bound atom (qi), that is, Qi = Z − qi. The SETi values for each atom are obtained from Equation 2 using the AM1 net atomic charges (Qi). With AM1 calculations these quantities are more easily obtained for a large number of molecules of reasonable size than with ab initio calculations employing a minimal basis set [36]. Despite the usually limited quantitative accuracy of semi-empirical methods, the computational efficiency available nowadays [35] enables the electronic properties of a large number of molecules to be obtained in a reasonable amount of time, and computational time is an important feature when developing quantitative structure-activity relationship (QSAR) models [37].

3. Results and Discussion

The 3-hexanone molecule represented in the graph below is taken as an example of the ISET calculation using the present approach. The net atomic charges and SETi values are given in Table II of reference [24].

[Figure: molecular graph of 3-hexanone with the atom labels O1 and C1 to C6 used in the calculation below]
$$\mu_F = d \, \lvert Q_C - Q_O \rvert$$

$$\mu_F = 1.2342 \, \lvert 0.224 - (-0.288) \rvert = 0.6319$$

$$A_\mu = 1 + \log \left[ 1 + (2.6790/0.63191) \right] = 1.7193$$

$$I_{SET_{O1}} \ (\text{=O}) = A_\mu SET_{O1} + \log (A_\mu SET_{C3}) = 1.9507 + \log 0.3899 = 1.5416$$

$$I_{SET_{C1}} \ (-\text{CH}_3) = SET_{C1} + \log SET_{C2} = 0.9892 + \log 0.9998 = 0.9891$$

$$I_{SET_{C2}} \ (-\text{CH}_2-) = SET_{C2} + \log SET_{C1} + \log (A_\mu SET_{C3}) = 0.9998 + \log 0.9892 + \log 0.3899 = 0.5860$$

$$I_{SET_{C3}} \ (\text{>C<}) = A_\mu SET_{C3} + \log SET_{C2} + \log (A_\mu SET_{O1}) + \log SET_{C4} = 0.3899 + \log 0.9998 + \log 1.9507 + \log 0.9998 = 0.6799$$

$$I_{SET_{C4}} \ (-\text{CH}_2-) = SET_{C4} + \log (A_\mu SET_{C3}) + \log SET_{C5} = 0.9998 + \log 0.3899 + \log 0.8988 = 0.5444$$

$$I_{SET_{C5}} \ (-\text{CH}_2-) = SET_{C5} + \log SET_{C4} + \log SET_{C6} = 0.8988 + \log 0.9998 + \log 0.9998 = 0.8986$$

$$I_{SET_{C6}} \ (-\text{CH}_3) = SET_{C6} + \log SET_{C5} = 0.9998 + \log 0.8988 = 0.9535$$

$$I_{SET} = 1.5416 + 0.9891 + 0.5860 + 0.6799 + 0.5444 + 0.8986 + 0.9535 = 6.1931$$
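The arithmetic of this worked example can be checked with a short Python sketch. It takes the numbers printed above (the C=O distance d, the AM1 charges, the dipole moment and the effective atomic values, i.e., Aμ·SETi for O1 and C3 and plain SETi for the other carbons) and applies Equations 1 and 2 with base-10 logarithms. The function and variable names are hypothetical, and the bond list is simply the heavy-atom connectivity of 3-hexanone; this is a check of the printed numbers, not the authors' software.

```python
import math

# Values quoted in the worked example for 3-hexanone.
D_CO = 1.2342   # distance d entering the local dipole moment
Q_C3 = 0.224    # AM1 net charge on the carbonyl carbon
Q_O1 = -0.288   # AM1 net charge on the oxygen
MU = 2.6790     # calculated molecular dipole moment


def local_dipole(d, q_c, q_o):
    """Equivalent local dipole moment: mu_F = d * |Q_C - Q_O|."""
    return d * abs(q_c - q_o)


def dipolar_function(mu, mu_f):
    """Equation 2: A_mu = 1 + log10(1 + mu / mu_F)."""
    return 1.0 + math.log10(1.0 + mu / mu_f)


# Effective atomic values as printed in the text: A_mu * SET_i for O1 and C3,
# SET_i for the remaining carbons (where A_mu is taken as unity).
effective_set = {
    "O1": 1.9507, "C1": 0.9892, "C2": 0.9998, "C3": 0.3899,
    "C4": 0.9998, "C5": 0.8988, "C6": 0.9998,
}

# Heavy-atom connectivity of 3-hexanone.
bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "O1"),
         ("C3", "C4"), ("C4", "C5"), ("C5", "C6")]


def iset(values, bond_list):
    """Equation 1: for each atom i, value_i plus log10(value_j) over bonded atoms j."""
    neighbours = {atom: [] for atom in values}
    for a, b in bond_list:
        neighbours[a].append(b)
        neighbours[b].append(a)
    per_atom = {
        atom: values[atom] + sum(math.log10(values[n]) for n in neighbours[atom])
        for atom in values
    }
    return per_atom, sum(per_atom.values())


mu_f = local_dipole(D_CO, Q_C3, Q_O1)
a_mu = dipolar_function(MU, mu_f)
per_atom, total = iset(effective_set, bonds)
print(f"mu_F = {mu_f:.4f}  A_mu = {a_mu:.4f}  I_SET = {total:.4f}")
```

The total computed without intermediate rounding agrees with the printed sum of 6.1931 to within about 0.0002; the small difference arises because the text sums atomic contributions already rounded to four decimals.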

The results of the simple linear regression between the experimental log P values and those calculated using ISET are shown in Table 2 for each class of compounds studied. They indicate that the partition coefficients calculated using the ISET method are in good agreement with the experimental partition coefficients. The QSPR models obtained with ISET showed high values for the correlation coefficient (r > 0.99), and the leave-one-out cross-validation demonstrates that the final models are statistically significant and reliable (rcv² > 0.98). As can be observed, this model explains more than 99% of the variance in the experimental values for this set of compounds. Among the various classes of compounds, the best results obtained with the ISET method are for hydrocarbons (Table 2), which is related to the fact that the present model was developed initially for this class of organic compounds. Values of r = 0.9986 and s = 0.10 were obtained for hydrocarbons, the latter being the lowest standard deviation among the five models.

The present results can be compared with those recently published for a new approach based on the Kovats retention indices, which uses multiple linear regression [7], where reportedly s = 0.46 for 37 hydrocarbons, s = 0.27 for 11 aldehydes, s = 0.32 for 27 alcohols and s = 0.17 for 13 esters. As can be seen in Table 2, with ISET the lowest standard deviation was obtained for the aldehydes (s = 0.05) and the highest for the alcohols (s = 0.18). The range of standard deviations obtained verifies the applicability of the present approach to different classes of organic compounds. For alcohols, the earlier approach of Duchowicz et al. [6], based on the concept of flexible topological descriptors and on the optimization of correlation weights of local graph invariants, was applied to model the octanol/water partition coefficient of a representative set of 62 alcohols, resulting in a satisfactory prediction with a standard deviation of 0.22. Recently, Liu et al. [39] carried out a QSPR study to predict the log P of 58 aliphatic alcohols using novel molecular indices based on graph theory, dividing the molecular structure into substructures. They obtained models with good stability and robustness, and the values predicted using multiple linear regression are close to the experimental values (r = 0.9959 and s = 0.15). These comparisons show the reliability of the present model, which is based on the semi-empirical calculation of atomic charges and local dipole moments and uses only one descriptor, ISET.

The statistical analysis of the predictive ability of the four log P calculation models and ISET for the set of 131 aliphatic organic compounds from five different classes is summarized in Table 2. The AlogP method gives a stable performance for all classes of organic compounds tested, with much less variability in the statistical quality of the results among different subclasses (r > 0.98 and s < 0.22). The ClogP method offers good predictability (r > 0.99 and s < 0.17), giving larger deviations only in the case of ketones (r = 0.955; s = 0.40). The MlogP and Ghose/Crippen methods have much larger deviations (r > 0.974 and s < 0.39) in comparison with the other methods.

The experimental and predicted log P values using ISET and the other four models (and the respective deviations) for an external group of alcohols are shown in Table 3. The Ghose/Crippen method and its refinement AlogP show appreciable deviations for 1-undecanol and 4,4-dimethyl-1-pentanol, respectively, whereas the ClogP deviations are greater for branched alcohols. For the last three branched alcohols in Table 3, the whole-molecule approach MlogP, which employs a multiple linear regression with a final equation involving 13 parameters, gives the same value of log P, being unable to distinguish the structural differences between these branched alcohols. The average standard deviation of the calculated log P for the seven alcohols of Table 3 using the ISET model is 0.15, whereas for the Ghose/Crippen method it is 0.34. The AlogP method, which is applicable to most neutral organic compounds and selected charged compounds, shows an average standard deviation of 0.26. The ClogP method, which uses a large number of parameters and correction factors, results in a standard deviation of 0.17, while for the whole-molecule approach the value is 0.24. These results demonstrate that the accuracy of the present model for polar aliphatic organic compounds is comparable to that of the widely used ClogP model.

The predictive ability of a QSPR model can be estimated using an external test set of compounds that has not been used for building the model. According to Golbraikh and Tropsha [40], a high value of cross-validated r² (q²) alone is an insufficient criterion for a QSAR model to be considered highly predictive, and the use of an external set of compounds for model validation is always necessary. The authors state that the correlation coefficient, r, between the predicted and observed activities of compounds from an external test set should be close to 1 [40,41]. Following these authors, we considered seven compounds not included in the original model (Table 3). Plotting observed vs. predicted log P values gave Y = 1.0273X − 0.1223 with r² = 0.9858, and Y = 0.9893X with the intercept set to 0 (r² = 0.9842). Plotting predicted vs. observed log P values gave Y = 0.9596X + 0.1557 with r² = 0.9858, and Y = 1.008X with the intercept set to 0 (r² = 0.9828). The QSPR model has a leave-one-out cross-validated coefficient rcv² = 0.9870, showing that the model has high predictive power.
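The quantities used in this kind of check (r² between predicted and observed values, and the slopes of the regression lines forced through the origin in both directions) can be sketched in a few lines of Python. Table 3 is not reproduced in this section, so the sketch is applied to the ketone rows of Table 1 (ISET-predicted and experimental log P) purely to exercise the functions; these are training compounds, not the external set. The function names are hypothetical, and the acceptance window of 0.85 to 1.15 for the slopes is the one usually attributed to [40], quoted here as an assumption.

```python
# (I_SET-predicted log P, experimental log P) for the ketones of Table 1.
# Training compounds, used here only to exercise the functions.
pairs = [(-0.08, -0.24), (0.30, 0.29), (0.84, 0.91), (1.38, 1.38),
         (1.92, 1.98), (2.48, 2.37), (3.02, 3.14), (3.57, 3.73),
         (4.11, 4.09), (4.66, 4.55), (0.84, 0.99), (0.73, 0.84)]

predicted = [p for p, _ in pairs]
observed = [o for _, o in pairs]


def pearson_r2(xs, ys):
    """Squared Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def slope_through_origin(xs, ys):
    """Least-squares slope k of y = k * x with the intercept fixed at 0."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


r2 = pearson_r2(predicted, observed)
k = slope_through_origin(predicted, observed)        # observed vs. predicted
k_prime = slope_through_origin(observed, predicted)  # predicted vs. observed

# Assumed acceptance window for both slopes (see lead-in).
slopes_ok = all(0.85 <= value <= 1.15 for value in (k, k_prime))
print(f"r2 = {r2:.4f}, k = {k:.4f}, k' = {k_prime:.4f}, slopes ok: {slopes_ok}")
```

Slopes close to 1 in both directions indicate that the predictions are not systematically scaled relative to the observations, which a high r² alone does not guarantee.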

4. Conclusions

The efficiency and applicability of the descriptor ISET for predicting log P using the quantitative structure-property relationship (QSPR) approach were demonstrated through the good statistical quality and high internal stability obtained for the studied classes of compounds, as well as the good predictive ability for the external group of compounds. The ISET model also has the advantage of simplicity, using only one descriptor, and it has statistical quality of the same order as the widely used models based on the fragmental method, ClogP, and the atomic-contribution method, AlogP. The quality of the results obtained can be considered appropriate for the development of QSPR models for other compounds in the future.

Acknowledgments

The authors wish to thank CNPq (National Council for Scientific and Technological Development, Brazil) for financial support.

References

  1. Sakuratani, Y; Kasai, K; Noguchi, Y; Yamada, Y. Comparison of predictivities of log P calculation models based on experimental data for 134 simple organic compounds. QSAR Comb. Sci. 2007, 26, 109–116.
  2. Hughes, LD; Palmer, DS; Nigsch, F; Mitchell, JBO. Why are some properties more difficult to predict than others? A study of QSPR models of solubility, melting point, and log P. J. Chem. Inf. Model. 2008, 48, 220–232.
  3. Meylan, WM; Howard, PH. Estimating log P with atom/fragments and water solubility with log P. Perspect. Drug Discov. 2000, 19, 67–84.
  4. Mannhold, R; Rekker, RF. The hydrophobic fragmental constant approach for calculating log P in octanol/water and aliphatic hydrocarbon/water systems. Perspect. Drug Discov. 2000, 18, 1–18.
  5. Mannhold, R; van de Waterbeemd, H. Substructure and whole molecule approaches for calculating log P. J. Comput. Aided Mol. Des. 2001, 15, 337–354.
  6. Duchowicz, PR; Castro, EA; Toropov, AA; Nesterova, AI; Nabiev, OM. QSPR modeling of the octanol/water partition coefficient of alcohols by means of optimization of correlation weights of local graph invariants. J. Argent. Chem. Soc. 2004, 92, 29–42.
  7. Spafiu, F; Mischie, A; Ionita, P; Beteringhe, A; Constantinescu, T; Balaban, AT. New alternatives for estimating the octanol/water partition coefficient and water solubility for volatile organic compounds using GLC data (Kovàts retention indices). ARKIVOC 2009, 2009, 174–194.
  8. Mannhold, R; Poda, GI; Ostermann, C; Tetko, IV. Calculation of molecular lipophilicity: State-of-the-art and comparison of log P methods on more than 96,000 compounds. J. Pharm. Sci. 2009, 98, 861–893.
  9. Fujita, T; Iwasa, J; Hansch, C. A new substituent constant, π, derived from partition coefficients. J. Am. Chem. Soc. 1964, 86, 5175–5180.
  10. Hansch, C; Quinlan, JE; Lawrence, GL. The linear free-energy relationship between partition coefficients and aqueous solubility of organic liquids. J. Org. Chem. 1968, 33, 347–350.
  11. Rekker, RF; Kort, HMD. Hydrophobic fragmental constant: Extension to a 1000 data point set. Eur. J. Med. Chem. 1979, 14, 479–488.
  12. Ghose, AK; Crippen, GM. Atomic physicochemical parameters for three-dimensional-structure-directed quantitative structure-activity relationships. 2. Modeling dispersive and hydrophobic interactions. J. Chem. Inf. Comput. Sci. 1987, 27, 21–35.
  13. Katritzky, AR; Petrukhin, R; Tatham, D. Interpretation of quantitative structure-property and activity relationships. J. Chem. Inf. Comput. Sci. 2001, 41, 679–685.
  14. Devillers, J; Balaban, AT. Topological Indices and Related Descriptors in QSAR and QSPR; Gordon and Breach Science Publishers: Amsterdam, The Netherlands, 1999.
  15. Karelson, M. Molecular Descriptors in QSAR/QSPR; Wiley-Interscience: New York, NY, USA, 2000.
  16. Todeschini, R; Consonni, V. Molecular Descriptors for Chemoinformatics, 2nd ed; Wiley-VCH: Weinheim, Germany, 2009.
  17. Héberger, K. Quantitative structure-(chromatographic) retention relationships. J. Chromatogr. A 2007, 1158, 273–305.
  18. Kier, LB; Hall, LH. Molecular Connectivity and Structure-Activity Analysis; Wiley-VCH: New York, NY, USA, 1986.
  19. Mihalic, M; Trinajstic, N. A graph-theoretical approach to structure-property relationships. J. Chem. Educ. 1992, 69, 701–712.
  20. Ren, B. Novel atomic-level-based AI topological descriptors: Application to QSPR/QSAR modeling. J. Chem. Inf. Comput. Sci. 2002, 42, 858–868.
  21. Randic, M. On interpretation of well-known topological indices. J. Chem. Inf. Comput. Sci. 2001, 41, 550–560.
  22. Randic, M. On history of the Randic index and emerging hostility toward chemical graph theory. Match Commun. Math. Comput. Chem. 2008, 59, 5–124.
  23. Souza, ES; Junkes, BS; Kuhnen, CA; Yunes, RA; Heinzen, VEF. On a new semi-empirical electrotopological index for QSRR models. J. Chemom. 2008, 22, 378–384.
  24. Souza, ES; Kuhnen, CA; Junkes, BS; Yunes, RA; Heinzen, VEF. Modelling the semi-empirical electrotopological index in QSPR studies for aldehydes and ketones. J. Chemom. 2009, 23, 229–235.
  25. Souza, ES; Kuhnen, CA; Junkes, BS; Yunes, RA; Heinzen, VEF. Quantitative structure-retention relationship modelling of esters on stationary phases of different polarity. J. Mol. Graph. Model. 2009, 28, 20–27.
  26. Souza, ES; Kuhnen, CA; Junkes, BS; Yunes, RA; Heinzen, VEF. Development of semi-empirical electrotopological index using the atomic charge in QSPR/QSRR models for alcohols. J. Chemom. 2010, 24, 149–157.
  27. MicroCal Origin, version 5; Microcal Software: Northampton, MA, USA, 1997.
  28. TSAR 3.3 for Windows; Oxford Molecular: Cambridge, UK, 2000.
  29. HyperChem for Windows, Release 7.01; serial number 12-701-150170036; Hypercube: Gainesville, FL, USA, 2002.
  30. Ghose, AK; Viswanadhan, VN; Wendoloski, JJ. Prediction of hydrophobic (lipophilic) properties of small organic molecules using fragmental methods: An analysis of ALOGP and CLOGP methods. J. Phys. Chem. A 1998, 102, 3762–3772.
  31. Wildman, SA; Crippen, GM. Prediction of physicochemical properties by atomic contributions. J. Chem. Inf. Comput. Sci. 1999, 39, 868–873.
  32. Leo, A; Jow, PYC; Silipo, C; Hansch, C. Calculation of hydrophobic constant (log P) from π and ϕ constants. J. Med. Chem. 1975, 18, 865–868.
  33. Organic Chemistry Portal, ClogP calculation. Available online: http://www.organic-chemistry.org/prog/peo (accessed on 14 August 2011).
  34. Virtual Computational Chemistry Laboratory, ALOGPS 2.1 program. Available online: http://www.vcclab.org/lab/alogps/start.html (accessed on 14 August 2011).
  35. Moriguchi, I; Hirono, S; Liu, Q; Nakagome, I; Matsushita, Y. Simple method of calculating octanol/water partition coefficient. Chem. Pharm. Bull. 1992, 40, 127–130.
  36. Bredow, T; Jug, K. Theory and range of modern semiempirical molecular orbital methods. Theor. Chem. Acc. 2005, 113, 1–14.
  37. Kikuchi, O. Systematic QSAR procedures with quantum chemical descriptors. Quant. Struct. Act. Relat. 1987, 6, 179–184.
  38. Smith, WB. Introduction to Theoretical Organic Chemistry and Molecular Modeling; Wiley-VCH: New York, NY, USA, 1996.
  39. Liu, F; Cao, C; Cheng, B. A quantitative structure-property relationship (QSPR) study of aliphatic alcohols by the method of dividing the molecular structure into substructure. Int. J. Mol. Sci. 2011, 12, 2448–2462.
  40. Golbraikh, A; Tropsha, A. Beware of q2! J. Mol. Graph. Model. 2002, 20, 269–276.
  41. Golbraikh, A; Shen, M; Xiao, Z; Xiao, YD; Lee, KH; Tropsha, A. Rational selection of training and test sets for the development of validated QSAR models. J. Comput. Aided Mol. Des. 2003, 17, 241–253.
Table 1. Semi-Empirical Electrotopological Indices (ISET), calculated values for Log P using Atomic-Contribution Methods (Ghose/Crippen and AlogP), Fragmental Method (ClogP), Topological indices (MlogP and ISET) and experimental Log P values (Log Pexp) for the studied set of compounds.
Table 1. Semi-Empirical Electrotopological Indices (ISET), calculated values for Log P using Atomic-Contribution Methods (Ghose/Crippen and AlogP), Fragmental Method (ClogP), Topological indices (MlogP and ISET) and experimental Log P values (Log Pexp) for the studied set of compounds.
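| No. | Class of compounds | ISET | ISET Log P | Ghose/Crippen Log P | AlogP | ClogP | MlogP | Log Pexp |
|---|---|---|---|---|---|---|---|---|
| | Hydrocarbon | | | | | | | |
| 01 | Ethane | 1.9981 | 1.88 | 1.30 | 1.28 | 1.38 | 1.76 | 1.81 |
| 02 | Propane | 2.8148 | 2.40 | 1.69 | 1.74 | 1.84 | 2.28 | 2.36 |
| 03 | n-Butane | 3.6343 | 2.91 | 2.09 | 2.20 | 2.31 | 2.73 | 2.89 |
| 04 | n-Pentane | 4.4457 | 3.43 | 2.49 | 2.65 | 2.77 | 3.14 | 3.39 |
| 05 | n-Hexane | 5.2622 | 3.95 | 2.88 | 3.11 | 3.23 | 3.52 | 4.00 |
| 06 | n-Heptane | 6.0787 | 4.46 | 3.28 | 3.57 | 3.70 | 3.87 | 4.50 |
| 07 | n-Octane | 6.8952 | 4.98 | 3.67 | 4.02 | 4.16 | 4.20 | 5.15 |
| 08 | n-Nonane | 7.7117 | 5.49 | 4.07 | 4.48 | 4.63 | 4.52 | 5.65 |
| 09 | n-Decane | 8.5282 | 6.01 | 4.47 | 4.93 | 5.09 | 4.82 | 6.25 |
| 10 | n-Undecane | 9.3447 | 6.53 | 4.86 | 5.39 | 5.55 | 5.11 | 6.54 |
| 11 | n-Dodecane | 10.1612 | 7.04 | 5.26 | 5.85 | 6.02 | 5.40 | 6.80 |
| 12 | n-Tridecane | 10.9777 | 7.56 | 5.66 | 6.30 | 6.48 | 5.67 | 7.50 |
| 13 | n-Tetradecane | 11.7942 | 8.08 | 6.05 | 6.76 | 6.95 | 5.93 | 8.00 |
| 14 | 2-Methylpropane | 3.5421 | 2.86 | 2.02 | 1.99 | 2.18 | 2.73 | 2.76 |
| 15 | 3-Methylheptane | 6.7641 | 4.89 | 3.61 | 3.36 | 4.04 | 3.87 | |
| 16 | 2,4-Dimethylpentane | 5.8455 | 4.31 | 3.15 | 3.16 | 3.45 | 3.87 | |
| 17 | Ethene | 2.0294 | 1.20 | 1.13 | 0.95 | 1.15 | 0.70 | 1.13 |
| 18 | Propene | 2.8082 | 1.74 | 1.48 | 1.35 | 1.55 | 1.22 | 1.77 |
| 19 | 1-Butene | 3.5848 | 2.28 | 1.87 | 1.81 | 2.01 | 1.67 | 2.40 |
| 20 | 1-Pentene | 4.3996 | 2.84 | 2.27 | 2.26 | 2.48 | 2.08 | 2.80 |
| 21 | 1-Hexene | 5.2140 | 3.40 | 2.67 | 2.72 | 2.94 | 2.46 | 3.40 |
| 22 | 1-Heptene | 6.0305 | 3.96 | 3.06 | 3.17 | 3.40 | 2.81 | 3.99 |
| 23 | 1-Octene | 6.8606 | 4.53 | 3.46 | 3.63 | 3.87 | 3.15 | 4.57 |
| 24 | E-2-Octene | 6.7939 | 4.49 | 3.41 | 3.58 | 3.80 | 3.15 | 4.44 |
| 25 | 2-Ethylhexene | 6.5614 | 4.33 | 3.22 | 3.57 | 3.35 | 3.15 | 4.31 |
| | Aldehyde | | | | | | | |
| 01 | Acetaldehyde | 3.3967 | −0.23 | −0.58 | −0.18 | 0.43 | −0.32 | −0.22 |
| 02 | Propionaldehyde | 4.1866 | 0.27 | 0.05 | 0.48 | 0.89 | 0.20 | 0.30 |
| 03 | Butyraldehyde | 5.0052 | 0.79 | 0.44 | 0.94 | 1.36 | 0.65 | 0.83 |
| 04 | Hexanal | 6.6508 | 1.85 | 1.24 | 1.85 | 2.28 | 1.44 | 1.89 |
| 05 | Heptanal | 7.4709 | 2.38 | 1.63 | 2.31 | 2.75 | 1.79 | 2.42 |
| 06 | Octanal | 8.2859 | 2.89 | 2.03 | 2.77 | 3.21 | 3.04 | 2.90 |
| 07 | 2-Methyl-1-Propanal | 5.6519 | 0.73 | 0.61 | 0.95 | 1.23 | 0.65 | 0.77 |
| 08 | E-2-Butenal | 3.8057 | 0.60 | 0.52 | 0.92 | 1.00 | 0.55 | 0.52 |
| 09 | E-2-Hexenal | 5.4466 | 1.68 | 1.32 | 1.83 | 1.93 | 1.34 | 1.58 |
| | Ketone | | | | | | | |
| 01 | Acetone | 4.0158 | −0.08 | 0.38 | −0.24 | 0.74 | 0.20 | −0.24 |
| 02 | 2-Butanone | 4.5952 | 0.30 | 1.01 | 0.42 | 1.21 | 0.65 | 0.29 |
| 03 | 2-Pentanone | 5.3987 | 0.84 | 1.40 | 0.88 | 1.67 | 1.06 | 0.91 |
| 04 | 2-Hexanone | 6.1987 | 1.38 | 1.80 | 1.34 | 2.14 | 1.44 | 1.38 |
| 05 | 2-Heptanone | 7.0080 | 1.92 | 2.20 | 1.79 | 2.60 | 1.79 | 1.98 |
| 06 | 2-Octanone | 7.8306 | 2.48 | 2.59 | 2.25 | 3.06 | 2.13 | 2.37 |
| 07 | 2-Nonanone | 8.6458 | 3.02 | 2.99 | 2.70 | 3.53 | 3.36 | 3.14 |
| 08 | 2-Decanone | 9.4583 | 3.57 | 3.39 | 3.16 | 3.99 | 3.66 | 3.73 |
| 09 | 2-Undecanone | 10.2706 | 4.11 | 3.78 | 3.62 | 4.46 | 3.95 | 4.09 |
| 10 | 2-Dodecanone | 11.0872 | 4.66 | 4.18 | 4.07 | 4.92 | 4.23 | 4.55 |
| 11 | 3-Pentanone | 5.3900 | 0.84 | 1.64 | 1.09 | 1.67 | 1.06 | 0.99 |
| 12 | 3-Methyl-2-Butanone | 5.2258 | 0.73 | 1.57 | 0.88 | 1.55 | 1.06 | 0.84 |
| 13 | 4-Methyl-2-Pentanone | 6.0484 | 1.28 | 1.73 | 1.13 | 2.01 | 1.44 | 1.31 |
| 14 | 5-Nonanone | 8.5885 | 2.98 | 3.22 | 2.91 | 3.53 | 2.45 | 2.88 |
| 15 | 3-Hexanone | 6.1931 | 1.37 | 2.03 | 1.55 | 2.14 | 1.44 | 1.45 |
| 16 | 2,2-Dimethyl-3-Butanone | 5.8039 | 1.11 | 2.24 | 1.30 | 2.06 | 1.44 | 1.20 |
| 17 | 5-Methyl-2-Hexanone | 6.8815 | 1.84 | 2.13 | 1.59 | 2.48 | 1.79 | 1.88 |
| 18 | 5-Methyl-2-Octanone | 8.5182 | 2.94 | 2.92 | 2.50 | 3.40 | 2.45 | 2.92 |
| 19 | 2,2,4,4-Tetramethyl-3-Pentanone | 7.7789 | 2.44 | 4.09 | 2.85 | 2.05 | 2.45 | 3.00 |
| 20 | 3-Methyl-2-Pentanone | 6.0746 | 1.30 | 1.97 | 1.34 | 2.01 | 1.44 | |
| 21 | 4-Methyl-3-Pentanone | 6.0227 | 1.26 | 2.20 | 1.55 | 2.01 | 1.44 | |
| 22 | 4-Heptanone | 7.0130 | 1.93 | 2.43 | 2.00 | 2.60 | 1.79 | |
| 23 | 2,4-Dimethyl-3-Pentanone | 6.6629 | 1.69 | 2.76 | 2.02 | 2.35 | 1.79 | |
| | Ester | | | | | | | |
| 01 | Methyl Acetate | 5.2056 | 0.20 | −0.14 | 0.02 | 0.48 | 0.13 | 0.18 |
| 02 | Ethyl Acetate | 5.9566 | 0.72 | 0.21 | 0.37 | 0.91 | 0.59 | 0.73 |
| 03 | 2-Methylbutyl Acetate | 8.1580 | 2.23 | 1.47 | 1.67 | 2.18 | 1.73 | 2.29 |
| 04 | Propyl Acetate | 6.8215 | 1.31 | 0.67 | 0.89 | 1.38 | 1.00 | 1.24 |
| 05 | Butyl Acetate | 7.6480 | 1.88 | 1.07 | 1.35 | 1.84 | 1.37 | 1.82 |
| 06 | 3-Methylbutyl Acetate | 8.1012 | 2.19 | 1.40 | 1.60 | 2.18 | 1.73 | 2.25 |
| 07 | Propyl Butyrate | 8.3084 | 2.34 | 1.70 | 2.02 | 2.31 | 1.73 | 2.15 |
| 08 | Methyl Propionate | 5.9612 | 0.72 | 0.49 | 0.69 | 0.94 | 0.59 | 0.82 |
| 09 | Propyl Formate | 6.0387 | 0.77 | 0.47 | 0.85 | 1.11 | 0.59 | 0.83 |
| 10 | Isobutyl Isobutyrate | 8.5664 | 2.51 | 2.27 | 2.34 | 2.52 | 2.06 | 2.48 |
| 11 | Isopentyl Isovalerate | 9.9907 | 3.50 | 2.76 | 2.89 | 3.45 | 2.68 | 3.62 |
| 12 | Methyl Butyrate | 6.7703 | 1.27 | 0.89 | 1.14 | 1.41 | 1.00 | 1.29 |
| 13 | Methyl Isopentanoate | 7.2346 | 1.60 | 1.22 | 1.40 | 1.75 | 1.37 | 1.82 |
| 14 | Methyl Decanoate | 11.7131 | 4.55 | 3.37 | 3.88 | 4.19 | 3.88 | 4.41 |
| 15 | Ethyl Formate | 5.21385 | 0.20 | 0.0 | 0.32 | 0.64 | 0.13 | |
| 16 | Isopropyl Acetate | 6.3210 | 0.97 | 0.62 | 0.75 | 1.32 | 1.00 | |
| 17 | Isobutyl Acetate | 4.2872 | 1.69 | 1.08 | 1.21 | 1.72 | 1.37 | |
| 18 | Ethyl Butyrate | 7.5262 | 1.80 | 1.23 | 1.49 | 1.84 | 1.37 | |
| 19 | Ethyl Valerate | 8.3037 | 2.33 | 1.63 | 1.95 | 2.31 | 1.73 | |
| 20 | Ethyl Hexanoate | 9.1100 | 2.89 | 2.02 | 2.40 | 2.77 | 2.06 | |
| 21 | Ethyl Heptanoate | 9.9322 | 3.46 | 2.42 | 2.86 | 3.23 | 2.38 | |
| 22 | Ethyl Octanoate | 10.7424 | 4.02 | 2.82 | 3.32 | 3.7 | 3.59 | |
| 23 | Ethyl Nonanoate | 11.5522 | 4.58 | 3.21 | 3.77 | 4.16 | 3.88 | |
| 24 | Ethyl Decanoate | 12.3802 | 5.15 | 3.61 | 4.23 | 4.43 | 4.16 | |
| | Alcohol | | | | | | | |
| 01 | Ethanol | 5.0258 | −0.03 | 0.08 | −0.01 | 0.43 | −0.17 | −0.31 |
| 02 | 1-Propanol | 5.8387 | 0.48 | 0.55 | 0.51 | 0.89 | 0.35 | 0.34 |
| 03 | 1-Butanol | 6.6371 | 0.99 | 0.94 | 0.97 | 1.35 | 0.80 | 0.84 |
| 04 | 1-Pentanol | 7.4533 | 1.51 | 1.34 | 1.43 | 1.82 | 1.21 | 1.40 |
| 05 | 1-Hexanol | 8.2626 | 2.03 | 1.73 | 1.88 | 2.28 | 1.59 | 2.03 |
| 06 | 1-Heptanol | 9.0808 | 2.55 | 2.13 | 2.34 | 2.74 | 1.94 | 2.34 |
| 07 | 1-Octanol | 9.8913 | 3.07 | 2.53 | 2.80 | 3.21 | 3.19 | 3.15 |
| 08 | 1-Nonanol | 10.7101 | 3.60 | 2.92 | 3.25 | 3.67 | 3.50 | 3.57 |
| 09 | 1-Decanol | 11.5199 | 4.11 | 3.32 | 3.71 | 4.14 | 3.81 | 4.01 |
| 10 | 1-Dodecanol | 13.1499 | 5.15 | 4.11 | 4.62 | 5.07 | 4.38 | 5.13 |
| 11 | 1-Tetradecanol | 14.7791 | 6.20 | 4.91 | 5.53 | 5.99 | 4.91 | 6.11 |
| 12 | 1-Pentadecanol | 15.5986 | 6.72 | 5.30 | 5.99 | 6.46 | 5.17 | 6.64 |
| 13 | 1-Hexadecanol | 16.4091 | 7.24 | 5.70 | 6.45 | 6.92 | 5.42 | 7.17 |
| 14 | 1-Octadecanol | 18.039 | 8.28 | 6.49 | 7.36 | 7.85 | 5.90 | 8.22 |
| 15 | 2-Propanol | 5.1764 | 0.061 | 0.49 | 0.37 | 0.83 | 0.35 | 0.05 |
| 16 | 2-Butanol | 6.1384 | 0.67 | 0.96 | 0.89 | 1.29 | 0.80 | 0.61 |
| 17 | 2-Pentanol | 6.8713 | 1.14 | 1.36 | 1.35 | 1.76 | 1.21 | 1.14 |
| 18 | 2-Hexanol | 7.6936 | 1.67 | 1.75 | 1.80 | 2.22 | 1.59 | 1.61 |
| 19 | 2-Heptanol | 8.5136 | 2.19 | 2.15 | 2.26 | 2.68 | 1.94 | 2.31 |
| 20 | 2-Octanol | 9.3313 | 2.71 | 2.54 | 2.72 | 3.15 | 2.27 | 2.84 |
| 21 | 2-Nonanol | 10.1490 | 3.24 | 2.94 | 3.17 | 3.61 | 3.50 | 3.36 |
| 22 | 3-Pentanol | 6.9241 | 1.17 | 1.43 | 1.42 | 1.76 | 1.21 | 1.14 |
| 23 | 3-Hexanol | 7.7334 | 1.69 | 1.82 | 1.87 | 2.22 | 1.59 | 1.61 |
| 24 | 3-Heptanol | 8.5339 | 2.20 | 2.22 | 2.33 | 2.68 | 1.94 | 2.31 |
| 25 | 3-Nonanol | 10.1594 | 3.24 | 3.01 | 3.24 | 3.61 | 2.59 | 3.36 |
| 26 | 4-Heptanol | 8.4277 | 2.14 | 2.22 | 2.33 | 2.68 | 1.94 | 2.31 |
| 27 | 4-Nonanol | 10.0707 | 3.19 | 3.01 | 3.24 | 3.61 | 2.59 | 3.36 |
| 28 | 5-Nonanol | 10.0579 | 3.18 | 3.01 | 3.24 | 3.61 | 2.59 | 3.36 |
| 29 | 2-Methyl-1-propanol | 6.7118 | 1.04 | 1.34 | 0.83 | 1.23 | 0.80 | 0.65 |
| 30 | 2-Methyl-1-pentanol | 8.0889 | 1.92 | 1.74 | 1.75 | 2.16 | 1.59 | 1.78 |
| 31 | 2-Methyl-2-propanol | 5.6439 | 0.36 | 0.57 | 0.57 | 0.98 | 0.80 | 0.37 |
| 32 | 2-Methyl-2-butanol | 6.4088 | 0.85 | 1.04 | 1.10 | 1.44 | 1.21 | 0.89 |
| 33 | 2-Methyl-2-pentanol | 7.2184 | 1.36 | 1.43 | 1.55 | 1.91 | 1.59 | 1.39 |
| 34 | 2-Methyl-2-hexanol | 8.0185 | 1.87 | 1.83 | 2.01 | 2.37 | 1.94 | 1.84 |
| 35 | 2-Methyl-3-pentanol | 7.6238 | 1.62 | 1.83 | 1.74 | 2.10 | 1.59 | 1.67 |
| 36 | 3-Methyl-1-butanol | 7.3289 | 1.43 | 1.27 | 1.22 | 1.69 | 1.21 | 1.42 |
| 37 | 3-Methyl-2-butanol | 6.7223 | 1.05 | 1.36 | 1.21 | 1.63 | 1.21 | 1.14 |
| 38 | 3-Methyl-2-pentanol | 7.5616 | 1.58 | 1.76 | 1.67 | 2.10 | 1.59 | 1.67 |
| 39 | 3-Methyl-3-pentanol | 7.1923 | 1.35 | 1.51 | 1.62 | 1.91 | 1.59 | 1.39 |
| 40 | 3-Methyl-3-hexanol | 7.9993 | 1.86 | 1.90 | 2.08 | 2.37 | 1.94 | 1.87 |
| 41 | 4-Methyl-1-pentanol | 8.1457 | 1.96 | 1.67 | 1.68 | 2.16 | 1.59 | 1.78 |
| 42 | 4-Methyl-2-pentanol | 7.5971 | 1.60 | 1.69 | 1.60 | 2.10 | 1.59 | 1.67 |
| 43 | 5-Methyl-2-hexanol | 8.4042 | 2.12 | 2.08 | 2.06 | 2.56 | 1.94 | 2.19 |
| 44 | 2-Ethyl-1-butanol | 8.0637 | 1.90 | 1.74 | 1.75 | 2.16 | 1.59 | 1.78 |
| 45 | 2-Ethyl-1-hexanol | 9.6883 | 2.94 | 2.53 | 2.66 | 3.08 | 2.27 | 2.84 |
| 46 | 3-Ethyl-3-pentanol | 7.9941 | 1.86 | 1.97 | 2.14 | 2.37 | 1.94 | 1.87 |
| 47 | 2,2-Dimethyl-1-propanol | 6.8319 | 1.11 | 1.45 | 1.11 | 1.74 | 1.21 | 1.36 |
| 48 | 2,2-Dimethyl-1-butanol | 7.6225 | 1.62 | 1.85 | 1.56 | 2.21 | 1.59 | 1.57 |
| 49 | 2,2-Dimethyl-1-pentanol | 8.0200 | 1.87 | 2.25 | 2.02 | 2.67 | 1.94 | 2.39 |
| 50 | 2,2-Dimethyl-3-pentanol | 7.9220 | 1.81 | 2.34 | 2.01 | 2.61 | 1.94 | 2.27 |
| 51 | 2,3-Dimethyl-1-butanol | 7.7752 | 1.72 | 1.68 | 1.54 | 2.03 | 1.59 | 1.17 |
| 52 | 2,3-Dimethyl-2-butanol | 7.1113 | 1.29 | 1.44 | 1.42 | 1.78 | 1.59 | 1.17 |
| 53 | 2,3-Dimethyl-2-pentanol | 7.9254 | 1.81 | 1.84 | 1.87 | 2.25 | 1.94 | 2.27 |
| 54 | 2,4-Dimethyl-1-pentanol | 8.7738 | 2.36 | 2.07 | 2.00 | 2.50 | 1.94 | 2.19 |
| 55 | 2,4-Dimethyl-2-pentanol | 7.7712 | 1.727 | 1.76 | 1.80 | 2.25 | 1.94 | 1.67 |
| 56 | 2,4-Dimethyl-3-pentanol | 8.0997 | 1.93 | 2.23 | 2.05 | 2.44 | 1.94 | 2.31 |
| 57 | 2,6-Dimethyl-4-heptanol | 9.8577 | 3.05 | 2.88 | 2.83 | 3.36 | 2.59 | 3.13 |
| 58 | 3,3-Dimethyl-1-butanol | 7.3456 | 1.44 | 1.71 | 1.43 | 2.21 | 1.59 | 1.57 |
| 59 | 3,3-Dimethyl-2-butanol | 7.2531 | 1.38 | 1.87 | 1.49 | 2.15 | 1.59 | 1.19 |
| 60 | 2,2,3-Trimethyl-3-pentanol | 8.2383 | 2.01 | 2.41 | 2.21 | 2.76 | 2.27 | 1.99 |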
Table 2. The coefficients a and b (Y = a + bX) and statistical parameters (r², r, F, s, r²cv) for linear regressions between experimental and calculated Log P values using different methods (Ghose/Crippen Log P, AlogP, MlogP, ClogP, and ISET Log P) for each class of compounds studied (according to Table 1).
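| Class | Method | N | a | b | r² | r | F | s | r²cv |
|---|---|---|---|---|---|---|---|---|---|
| Hydrocarbon | Ghose/Crippen Log P | 23 | −0.0740 | 1.3559 | 0.9925 | 0.9962 | 2760.8 | 0.1694 | 0.9907 |
| Hydrocarbon | AlogP | 23 | 0.3080 | 1.1554 | 0.9952 | 0.9976 | 4345.4 | 0.1352 | 0.9940 |
| Hydrocarbon | ClogP | 23 | 0.1451 | 1.1513 | 0.9923 | 0.9961 | 2694.0 | 0.1715 | 0.9904 |
| Hydrocarbon | MlogP | 23 | −0.0923 | 1.2953 | 0.9565 | 0.9780 | 462.2 | 0.4066 | 0.9494 |
| Hydrocarbon | ISET Log P | 23 | 0.0039 | 0.9997 | 0.9971 | 0.9986 | 7289 | 0.1045 | 0.9964 |
| Alcohol | Ghose/Crippen Log P | 60 | −0.6651 | 1.3623 | 0.9822 | 0.9911 | 3202.8 | 0.2196 | 0.9813 |
| Alcohol | AlogP | 60 | −0.3038 | 1.1600 | 0.9897 | 0.9949 | 5592.7 | 0.1668 | 0.9893 |
| Alcohol | ClogP | 60 | −0.7966 | 1.1550 | 0.9914 | 0.9957 | 6651.4 | 0.1531 | 0.9910 |
| Alcohol | MlogP | 60 | −0.4666 | 1.3344 | 0.9611 | 0.9803 | 1431.6 | 0.3249 | 0.9561 |
| Alcohol | ISET Log P | 60 | 3.2482 | 0.6394 | 0.9876 | 0.9938 | 4612.6 | 0.1835 | 0.9870 |
| Aldehyde | Ghose/Crippen Log P | 9 | 0.2243 | 1.2357 | 0.9539 | 0.9767 | 145.0 | 0.2318 | 0.9134 |
| Aldehyde | AlogP | 9 | −0.2236 | 1.0954 | 0.9789 | 0.9894 | 324.6 | 0.1611 | 0.9613 |
| Aldehyde | ClogP | 9 | −0.6533 | 1.1187 | 0.9979 | 0.9990 | 3388.8 | 0.0503 | 0.9966 |
| Aldehyde | MlogP | 9 | 0.1668 | 1.0159 | 0.9489 | 0.9741 | 130.0 | 0.2566 | 0.8469 |
| Aldehyde | ISET Log P | 9 | 0.0016 | 1.0014 | 0.9972 | 0.9986 | 2525.9 | 0.0583 | 0.9961 |
| Ketone | Ghose/Crippen Log P | 19 | −0.8484 | 1.2097 | 0.9188 | 0.9585 | 192.3 | 0.3861 | 0.8867 |
| Ketone | AlogP | 19 | −0.1299 | 1.1494 | 0.9862 | 0.9931 | 1213.4 | 0.1593 | 0.9829 |
| Ketone | ClogP | 19 | −0.8479 | 1.1132 | 0.9115 | 0.9547 | 175.1 | 0.4031 | 0.8974 |
| Ketone | MlogP | 19 | −0.2586 | 1.1454 | 0.9694 | 0.9846 | 538.8 | 0.2370 | 0.9622 |
| Ketone | ISET Log P | 19 | −2.7182 | 0.6693 | 0.9864 | 0.9932 | 1229.7 | 0.1582 | 0.9831 |
| Ester | Ghose/Crippen Log P | 14 | 0.3894 | 1.1472 | 0.9688 | 0.9843 | 372.9 | 0.2124 | 0.9573 |
| Ester | AlogP | 14 | 0.1815 | 1.1080 | 0.9681 | 0.9839 | 364.7 | 0.2147 | 0.9590 |
| Ester | ClogP | 14 | −0.3054 | 1.1334 | 0.9943 | 0.9971 | 2076.6 | 0.0912 | 0.9928 |
| Ester | MlogP | 14 | 0.1370 | 1.1742 | 0.9851 | 0.9925 | 791.6 | 0.1470 | 0.9630 |
| Ester | ISET Log P | 14 | −3.1575 | 0.6587 | 0.9903 | 0.9951 | 1222.9 | 0.1186 | 0.9838 |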

a = intercept; b = slope; r² = coefficient of determination; r = correlation coefficient; s = standard deviation; r²cv = cross-validation coefficient; F = null hypothesis test (F-test).
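
The quantities tabulated in Table 2 follow from an ordinary least-squares fit of Y = a + bX. The sketch below is a minimal illustration for the nine aldehydes of Table 1, taking the experimental Log P as Y and the ISET Log P as X. The function names are hypothetical, and the leave-one-out scheme used here for r²cv is an assumption, since this section does not state which cross-validation procedure was applied.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def regression_stats(x, y):
    """Return a, b, r2, r, F, s and a leave-one-out r2cv for y = a + b*x."""
    n = len(x)
    a, b = fit_line(x, y)
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - ss_res / ss_tot
    s = sqrt(ss_res / (n - 2))                    # standard deviation of the fit
    f = (ss_tot - ss_res) / (ss_res / (n - 2))    # F statistic, one predictor
    # Assumed procedure: leave-one-out cross-validation (refit without point i,
    # then predict point i and accumulate the squared prediction error).
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a_i + b_i * x[i])) ** 2
    r = sqrt(r2) if b >= 0 else -sqrt(r2)
    return {"a": a, "b": b, "r2": r2, "r": r, "F": f, "s": s,
            "r2cv": 1.0 - press / ss_tot}

# Aldehydes 01-09 from Table 1 (values rounded to two decimals as tabulated).
iset_log_p = [-0.23, 0.27, 0.79, 1.85, 2.38, 2.89, 0.73, 0.60, 1.68]
log_p_exp = [-0.22, 0.30, 0.83, 1.89, 2.42, 2.90, 0.77, 0.52, 1.58]

stats = regression_stats(iset_log_p, log_p_exp)
for name, value in stats.items():
    print(f"{name:>5} = {value:.4f}")
```

Because the inputs are the rounded values printed in Table 1, small differences from the Table 2 entries in the last digits are to be expected.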

Table 3. Difference between experimental and predicted Log P (ΔLog P) using ISET and the different methods studied (Ghose/Crippen, AlogP, MlogP, ClogP) for external group of alcohols.
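| No. | Compounds | Log Pexp | ISET | Δ ISET Log P | Δ Ghose/Crippen Log P | Δ AlogP | Δ ClogP | Δ MlogP |
|---|---|---|---|---|---|---|---|---|
| 01 | 1-Undecanol | 4.42 | 12.3394 | −0.22 | 0.7 | 0.26 | −0.18 | 0.32 |
| 02 | 2-Undecanol | 4.42 | 11.7816 | 0.14 | 0.6 | 0.33 | −0.12 | 0.32 |
| 03 | 4-Octanol | 2.68 | 9.2504 | 0.02 | 0.06 | −0.1 | −0.47 | 0.41 |
| 04 | 2-Methyl-1-butanol | 1.14 | 7.2774 | −0.26 | −0.2 | −0.15 | −0.55 | −0.07 |
| 05 | 2-Methyl-3-hexanol | 2.19 | 8.2667 | 0.16 | −0.04 | 0 | −0.37 | 0.25 |
| 06 | 2,3-Dimethyl-3-pentanol | 1.67 | 7.78 | −0.05 | −0.24 | −0.27 | −0.58 | −0.27 |
| 07 | 4,4-Dimethyl-1-pentanol | 2.39 | 8.6815 | 0.09 | 0.29 | 0.51 | −0.28 | 0.45 |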
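
One simple way to compare the columns of Table 3 is to average the deviations per method. The following sketch computes the mean signed and mean absolute ΔLog P for the seven external alcohols. The summary statistics and variable names are illustrative choices rather than part of the original analysis, and Δ is taken here as experimental minus predicted, which is an assumption about the sign convention.

```python
# ΔLog P values transcribed from Table 3, one list per method, in the
# row order 01-07. Sign convention (experimental minus predicted) is assumed.
delta_log_p = {
    "ISET":          [-0.22, 0.14, 0.02, -0.26, 0.16, -0.05, 0.09],
    "Ghose/Crippen": [0.70, 0.60, 0.06, -0.20, -0.04, -0.24, 0.29],
    "AlogP":         [0.26, 0.33, -0.10, -0.15, 0.00, -0.27, 0.51],
    "ClogP":         [-0.18, -0.12, -0.47, -0.55, -0.37, -0.58, -0.28],
    "MlogP":         [0.32, 0.32, 0.41, -0.07, 0.25, -0.27, 0.45],
}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# For each method: (mean signed deviation, mean absolute deviation).
summary = {
    method: (mean(values), mean([abs(v) for v in values]))
    for method, values in delta_log_p.items()
}

# Print methods from smallest to largest mean absolute deviation.
for method, (bias, mad) in sorted(summary.items(), key=lambda kv: kv[1][1]):
    print(f"{method:<14} mean dev = {bias:+.3f}   mean |dev| = {mad:.3f}")
```

The mean signed deviation indicates systematic over- or under-prediction, while the mean absolute deviation reflects the typical size of the error regardless of sign. With only seven compounds, both figures should be read as rough indicators.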
Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.