Determination of Abraham Model Solute Descriptors for 62 Additional C10 through C13 Methyl- and Ethyl-Branched Alkanes

Abstract: Abraham model solute descriptors are reported for the first time for 62 additional C10 through C13 methyl- and ethyl-branched alkanes. The numerical values were determined using published gas chromatographic Kováts retention indices for 157 alkane solutes eluted from a squalane stationary phase column. The 95 alkane solutes that have known descriptor values were used to construct the Abraham model KRI versus L-solute descriptor correlation needed in our calculations. The calculated solute descriptors can be used in conjunction with previously published Abraham model correlations to predict a wide range of important physico-chemical and biological properties. The predictive computations are illustrated by estimating the air-to-polydimethylsiloxane partition coefficient for each of the 157 alkane solutes.


Introduction
Linear free energy relationships (LFERs) and Quantitative Structure Property Relationships (QSPRs) provide a convenient means to estimate physical and thermodynamic properties in the absence of direct experimental data. Predictive expressions have been developed for a wide range of properties including surface tensions, vapor pressures, boiling point and melting point temperatures, chromatographic retention indices, partition coefficients and enthalpies of solvation. The more successful methods not only provide reasonably accurate estimates of the desired property, but also further our understanding of the molecular interactions and structural features that govern the property of the specific molecule or specific solute-solvent combination under consideration. It is only by thoroughly understanding a process that one obtains the knowledge necessary to hopefully develop a more comprehensive predictive method.
The particular model that we [1][2][3][4] have been promoting during the last 20 years is commonly referred to as the Abraham solvation parameter method [5][6][7][8]. It was originally developed to describe solute transfer between two condensed phases:

$$\text{Solute Property} = e_{\text{eq 1}} \times E + s_{\text{eq 1}} \times S + a_{\text{eq 1}} \times A + b_{\text{eq 1}} \times B + v_{\text{eq 1}} \times V + c_{\text{eq 1}} \tag{1}$$

and solute transfer from the gas phase into a condensed phase:

$$\text{Solute Property} = e_{\text{eq 2}} \times E + s_{\text{eq 2}} \times S + a_{\text{eq 2}} \times A + b_{\text{eq 2}} \times B + l_{\text{eq 2}} \times L + c_{\text{eq 2}} \tag{2}$$

Logarithms of the water-to-organic solvent, log P, and gas-to-organic solvent partition coefficients, log K, were among the first properties to be correlated. The model was subsequently extended to other solute transfer processes, such as molar solubility ratios [1][2][3][4], blood-to-tissue and gas-to-tissue partition coefficients [9,10], gas-liquid chromatographic

Abraham model solute descriptors are currently known for all but 62 of the methyl- and ethyl-branched alkanes given in the paper.

Computational Methodology for Calculation of Abraham Model Solute Descriptors
Normally the determination of Abraham model solute descriptors involves constructing a series of mathematical expressions for the measured solute properties of the given solute in a series of solvents and/or for a series of processes for which the lowercase equation coefficients are known. The computational procedure is greatly simplified for the compounds considered in the current study (see Table 1 for the complete list of alkane solutes) because four of the solute descriptors are equal to zero. In other words, E = 0, S = 0, A = 0 and B = 0. Methyl- and ethyl-branched alkane solutes possess no excess molar refraction (E = 0) or polarity/polarizability (S = 0), and are not capable of hydrogen-bond formation (A = 0 and B = 0) with the surrounding solvent molecules. The V solute descriptor is readily calculated from the solute's molecular structure, the atomic volumes of the constituent atoms contained in the solute molecule and the number of chemical bonds in the solute molecule as described by Abraham and McGowan [35]. The calculated V solute descriptors for the four molecular formulas are: V = 1.5175 for C10

Isothermal chromatographic retention is often described in terms of either the retention factor [11,36]:

$$k_{(A)} = \frac{t_{r(A)} - t_m}{t_m} \tag{3}$$

or the Kováts retention index, KRI [37]:

$$\mathrm{KRI} = 100\, z_1 + 100\,(z_2 - z_1)\,\frac{\log\left(t_{r(A)} - t_m\right) - \log\left(t_{r(z1)} - t_m\right)}{\log\left(t_{r(z2)} - t_m\right) - \log\left(t_{r(z1)} - t_m\right)} \tag{4}$$

where $t_{r(A)}$ denotes the retention time of analyte A, $t_m$ refers to the so-called "hold-up" time measured by an unretained compound on the column, and $t_{r(z2)}$ and $t_{r(z1)}$ are the retention times of two linear alkanes having $z_2$ and $z_1$ carbon atoms, respectively. Equations (3) and (4) are written here in their standard isothermal forms. Because $t_r - t_m = k \times t_m$, each adjusted retention time in Equation (4) may be replaced by the corresponding retention factor without changing the value of KRI. Poole and coworkers [12,13,[38][39][40] have reported Abraham model correlations for describing the elution behavior of organic solutes, in terms of $\log k_{(A)}$, on a wide range of gas chromatographic stationary phase liquids, and for a wide range of HPLC stationary-mobile phase combinations, at different temperatures.
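The two retention measures described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard isothermal definitions of the retention factor and the Kováts retention index; the function names and the retention times in the usage example are hypothetical and are not taken from the study.

```python
import math


def retention_factor(t_r: float, t_m: float) -> float:
    """Retention factor k = (t_r - t_m) / t_m for retention time t_r and hold-up time t_m."""
    if t_m <= 0 or t_r <= t_m:
        raise ValueError("require 0 < t_m < t_r")
    return (t_r - t_m) / t_m


def kovats_index(t_a: float, t_z1: float, t_z2: float, t_m: float,
                 z1: int, z2: int) -> float:
    """Isothermal Kovats retention index of analyte A, bracketed by two
    linear alkanes with z1 and z2 carbon atoms (standard definition)."""
    log_k_a = math.log10(retention_factor(t_a, t_m))
    log_k_z1 = math.log10(retention_factor(t_z1, t_m))
    log_k_z2 = math.log10(retention_factor(t_z2, t_m))
    fraction = (log_k_a - log_k_z1) / (log_k_z2 - log_k_z1)
    return 100.0 * (z1 + (z2 - z1) * fraction)


# Hypothetical retention times in minutes (illustration only).
kri_example = kovats_index(t_a=7.0, t_z1=5.0, t_z2=9.0, t_m=1.0, z1=10, z2=11)
```

By construction, an analyte that co-elutes with the lower reference alkane returns 100 × z1, and one that co-elutes with the upper reference alkane returns 100 × z2.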
While the published correlations are extremely useful if one wishes to predict and/or analyze retention factor values, the published gas chromatographic data that we wish to analyze are given in terms of KRI values. We provide the basis for using the Abraham model to describe KRI values through the following mathematical manipulations. First, we substitute into Equation (2) the numerical descriptor values of the solutes:

$$\log k_{\text{solute}} = e_{\text{eq 2}} \times E_{\text{solute}} + s_{\text{eq 2}} \times S_{\text{solute}} + a_{\text{eq 2}} \times A_{\text{solute}} + b_{\text{eq 2}} \times B_{\text{solute}} + l_{\text{eq 2}} \times L_{\text{solute}} + c_{\text{eq 2}} \tag{5}$$

and the numerical values of the two reference linear alkanes:

$$\log k_{z1} = l_{\text{eq 2}} \times L_{z1} + c_{\text{eq 2}} \tag{6}$$

$$\log k_{z2} = l_{\text{eq 2}} \times L_{z2} + c_{\text{eq 2}} \tag{7}$$

Remember that the E, S, A and B solute descriptors of the reference linear alkanes are equal to zero, so only the $c_{\text{eq 2}}$ and $l_{\text{eq 2}}$ terms contribute. A combination of Equations (4)-(7) yields the following expression, in which the $c_{\text{eq 2}}$ terms cancel:

$$\mathrm{KRI} = 100\, z_1 + 100\,(z_2 - z_1)\,\frac{e_{\text{eq 2}} E + s_{\text{eq 2}} S + a_{\text{eq 2}} A + b_{\text{eq 2}} B + l_{\text{eq 2}} L - l_{\text{eq 2}} L_{z1}}{l_{\text{eq 2}} L_{z2} - l_{\text{eq 2}} L_{z1}} \tag{8}$$

which upon suitable algebraic rearrangement will give a mathematical form:

$$\mathrm{KRI} = e_{\text{eq 9}} \times E + s_{\text{eq 9}} \times S + a_{\text{eq 9}} \times A + b_{\text{eq 9}} \times B + l_{\text{eq 9}} \times L + c_{\text{eq 9}} \tag{9}$$

that is consistent with the Abraham model. We have changed the subscripting so as not to imply that the numerical values of the equation coefficients in Equation (9) are the same as those in Equation (2). Equation (9) provides the basis for the mathematical relationship between KRI and the L-solute descriptor. In Table 1 we have assembled the Kováts retention indices for the 157 linear and branched alkanes from the published paper by Heinzen and coworkers [34], along with the known descriptor values from our private database. In total we have 95 experimental values to use in developing our Abraham model KRI versus L-solute descriptor correlation.
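The rearrangement from a log k correlation to a KRI correlation can be checked numerically. The sketch below assumes that Equation (8) has the standard bracketed form (100 z1 plus 100 (z2 − z1) times the ratio of log k differences) and maps a set of log k coefficients onto the corresponding KRI coefficients. The function name, the log k coefficients and the L values for decane and undecane (4.686 and 5.191, commonly tabulated values) are assumptions introduced for illustration and are not taken from the study.

```python
def kri_coefficients(logk: dict, L_z1: float, L_z2: float, z1: int, z2: int) -> dict:
    """Convert log k coefficients (keys e, s, a, b, l, c) into KRI coefficients
    for one pair of bracketing linear alkanes with descriptors L_z1 and L_z2."""
    # Common scale factor: 100 (z2 - z1) / [l (L_z2 - L_z1)]
    scale = 100.0 * (z2 - z1) / (logk["l"] * (L_z2 - L_z1))
    kri = {name: scale * logk[name] for name in ("e", "s", "a", "b", "l")}
    # The log k intercept c cancels; the KRI intercept comes from the z1 reference.
    kri["c"] = 100.0 * z1 - scale * logk["l"] * L_z1
    return kri


# Hypothetical log k coefficients (illustration only).
logk_coeffs = {"e": 0.10, "s": 0.20, "a": 0.30, "b": 0.00, "l": 0.60, "c": -2.00}
kri_coeffs = kri_coefficients(logk_coeffs, L_z1=4.686, L_z2=5.191, z1=10, z2=11)

# For an alkane (E = S = A = B = 0) only the l and c terms survive.
kri_decane = kri_coeffs["l"] * 4.686 + kri_coeffs["c"]
kri_undecane = kri_coeffs["l"] * 5.191 + kri_coeffs["c"]
```

The converted coefficients depend on which pair of reference alkanes brackets the solute. Because L increases almost linearly with carbon number for the linear alkanes, the coefficients change only slightly from one bracket to the next, which is why a single linear KRI versus L correlation is a reasonable working form.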
The analysis of the numerical values in the second and third columns of Table 1 where N represents the number of experimental data points used in obtaining the linear relationship, SD gives the standard deviation of the residuals, R² and R² adjusted refer to the squared and adjusted squared correlation coefficients, respectively, and F is the Fisher F-statistic. The calculated standard errors in the slope and intercept are given in parentheses immediately following the numerical value of the corresponding equation coefficient. Equation (10) Table 1. Solute descriptors for these 62 additional alkanes will now be added to our private database, and will be available to us in future planned studies directed towards determining descriptor values for additional compounds from published chromatographic retention data.
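The general procedure, fitting a linear KRI versus L correlation to solutes with known descriptors and then inverting it for solutes whose L value is unknown, can be sketched as follows. This is an illustration of the approach only and does not reproduce Equation (10). The calibration set here is limited to four linear alkanes, whose Kováts indices are fixed by definition at 100 times the carbon number, and their L values are commonly tabulated values included as assumptions. The study itself used 95 alkanes. The KRI value of 1050 in the usage line is hypothetical.

```python
def fit_line(x: list, y: list) -> tuple:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Assumed L descriptors for decane, undecane, dodecane and tridecane.
known_L = [4.686, 5.191, 5.696, 6.200]
# Kovats indices of linear alkanes are 100 x carbon number by definition.
known_kri = [1000.0, 1100.0, 1200.0, 1300.0]

slope, intercept = fit_line(known_L, known_kri)


def L_from_kri(kri: float) -> float:
    """Invert KRI = slope * L + intercept to estimate the L descriptor."""
    return (kri - intercept) / slope


# Hypothetical branched C11 alkane eluting between decane and undecane.
L_estimate = L_from_kri(1050.0)
```

With only four calibration points this sketch gives no meaningful statistics; the standard deviation, R² and F values discussed above require the full 95-compound data set.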

Calculation of Air-to-Polydimethylsiloxane Partition Coefficients
In earlier publications [41][42][43], we illustrated the prediction of the standard molar enthalpies of vaporization and the standard molar enthalpies of sublimation at 298 K of more than 100 different large mono-methylated and large poly-methylated alkanes using the newly calculated solute descriptor values. Instead of simply repeating the computational procedure using a different set of alkane molecules, we wish to calculate the logarithm of the air-to-polydimethylsiloxane partition coefficient, $\log K_{\text{PDMS-air}}$, using our updated Abraham model correlation:

$$\log K_{\text{PDMS-air}} = -0.088 \times E_{\text{solute}} + 0.493 \times S_{\text{solute}} + 1.056 \times A_{\text{solute}} + 0.487 \times B_{\text{solute}} + 0.829 \times L_{\text{solute}} - 0.027 \tag{11}$$

based on the experimental values for 227 different organic compounds and inert gases. Equation (11) back-calculates the observed 227 data points to within a standard deviation of the residuals of SD = 0.177 log units, which is comparable to the experimental uncertainty associated with many of the data points used in the regression analysis. The equation coefficients differ slightly from an earlier correlation reported by Sprunger and coworkers [44] based on a much smaller number of 142 solute molecules. Polydimethylsiloxane, PDMS, is a coating often found in microextraction devices used to sample and analyze total hydrocarbons present in unknown air samples. Martos and coworkers [45] reported experimental $\log K_{\text{PDMS-air}}$ values for 29 smaller C6-C10 branched alkanes. The authors did not determine the experimental values for the larger C11-C13 alkane molecules considered in the current study.
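Equation (11) is straightforward to evaluate once the descriptors are in hand. The sketch below encodes the published coefficients of Equation (11); the `Solute` class and function name are hypothetical, and the L value used for decane (4.686) is a commonly tabulated value assumed here for illustration rather than read from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solute:
    """Abraham model solute descriptors needed for gas-to-condensed-phase transfer."""
    E: float
    S: float
    A: float
    B: float
    L: float


def log_k_pdms_air(solute: Solute) -> float:
    """Evaluate Equation (11) for the air-to-PDMS partition coefficient."""
    return (-0.088 * solute.E + 0.493 * solute.S + 1.056 * solute.A
            + 0.487 * solute.B + 0.829 * solute.L - 0.027)


# Alkanes have E = S = A = B = 0, so only the L term and the constant contribute.
decane = Solute(E=0.0, S=0.0, A=0.0, B=0.0, L=4.686)  # assumed L value
log_k_decane = log_k_pdms_air(decane)
```

For any alkane the expression collapses to 0.829 × L − 0.027, so the predicted partition coefficient depends on the L descriptor alone.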
In the third column of Table 2 we have given the values predicted by Equation (11) using either our existing solute descriptors or the values determined in the current study. The numerical values that are tabulated in the last column were retrieved from the published chemical literature [44,45]. For several of the compounds there were multiple experimental values that were determined by independent research groups. Sometimes the independently determined experimental values differed significantly, as was the case for: decane, $\log K_{\text{PDMS-air}}$ = 3.87 [46] versus $\log K_{\text{PDMS-air}}$ = 3.50 [44]; undecane, $\log K_{\text{PDMS-air}}$ = 4.40 [47] versus $\log K_{\text{PDMS-air}}$ = 3.89 [44]; and 3,3-dimethylpentane, $\log K_{\text{PDMS-air}}$ = 3.42 [45] versus $\log K_{\text{PDMS-air}}$ = 3.70 [48]. No attempt was made to select the experimental values that came closest to the calculated values based on Equation (11), as we wished to illustrate that predictive expressions can be used to identify possible outlier values in need of redetermination.
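One way to use a predictive expression as an outlier screen is to compare each reported value against the prediction and flag residuals that are large relative to the standard deviation of the correlation. The sketch below applies this idea to the conflicting decane and undecane values quoted above. The two-standard-deviation threshold is an arbitrary assumption, as are the L values (4.686 and 5.191, commonly tabulated values); the result is a screening heuristic, not a judgment about which measurement is correct.

```python
SD_EQ11 = 0.177  # standard deviation of residuals reported for Equation (11)


def predicted_log_k(L: float) -> float:
    """Equation (11) reduced to alkanes, for which E = S = A = B = 0."""
    return 0.829 * L - 0.027


# solute name -> (assumed L descriptor, reported log K values quoted in the text)
reported = {
    "decane": (4.686, [3.87, 3.50]),
    "undecane": (5.191, [4.40, 3.89]),
}


def screen(threshold: float = 2.0) -> list:
    """Return (name, reported value, residual, flagged) for every reported value."""
    rows = []
    for name, (L, values) in reported.items():
        for value in values:
            residual = value - predicted_log_k(L)
            flagged = abs(residual) > threshold * SD_EQ11
            rows.append((name, value, round(residual, 3), flagged))
    return rows


for row in screen():
    print(row)
```

A flagged value indicates only that the measurement and the correlation disagree by more than the chosen threshold; as the text notes, such values are candidates for redetermination.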

Summary
The current study represents a continuation of our ongoing efforts to determine experimental-based solute descriptors from measured solubility, partition coefficient and/or chromatographic retention data. Abraham model solute descriptors are reported for the first time for 62 additional C10 through C13 methyl- and ethyl-branched alkanes. The numerical values were determined using published gas chromatographic Kováts retention indices for 157 alkane solutes eluted from a squalane stationary phase column. The 95 alkane solutes that have known descriptor values were used to construct the Abraham model KRI versus L-solute descriptor correlation needed in our calculations. The calculated solute descriptors can be used in conjunction with previously published Abraham model correlations to predict a wide range of important physico-chemical and biological properties, including partition coefficients, vapor pressures, the standard molar enthalpies of vaporization and sublimation, chromatographic retention factors and retention times, nasal pungencies and eye irritation thresholds. Of the aforementioned properties, the chromatographic retention times will likely be the most useful, as they can aid in the identification of compounds present in unknown chemical samples. The predictive computations are illustrated by calculating the air-to-polydimethylsiloxane partition coefficients of the 157 alkane solutes. Polydimethylsiloxane, PDMS, is a coating often found in microextraction devices used to sample and analyze total hydrocarbons present in unknown air samples.
As part of the current study, an expression was derived that shows the mathematical relationship between the equation coefficients in the Abraham model log k expression and those in the KRI correlation. The derived relationship, Equation (8), provides a possible means to conveniently obtain the Abraham model KRI expression from existing log k correlations. All that is needed in the conversion is the L-solute descriptor values of the linear alkanes used in the calculation of Kováts retention indices.
We note at this time that the popularity of the Abraham solvation parameter model has recently prompted several research groups [49][50][51][52] to develop group contribution, quantum chemical or machine learning models to estimate the numerical values of solute descriptors. Our experience in using the different estimation software programs is that the methods do provide fairly good estimates of the descriptor values for simple compounds; however, as the structural complexity of the solute increases, the "quality" of the estimations decreases. For example, we have previously shown that two software programs overestimate the hydrogen-bond acidity (i.e., the A solute descriptor) for 4,5-dihydroxyanthraquinone-2-carboxylic acid [4], 1,4-dihydroxyanthraquinone [53] and 1,8-dihydroxyanthraquinone [53] when compared with values based on measured solubility data. The experimental-based A solute descriptor values were much smaller, and suggested the formation of strong intramolecular hydrogen bonds between the hydrogen of the -OH functional groups and the oxygen atom of the neighboring aromatic carbonyl group. The estimation methods do not appear to incorporate this structural feature and possible intramolecular hydrogen-bond formation in their calculation approach. We have further suggested that poor estimations might also result from the inadequate representation of select functional groups in the datasets used in developing/training the various group contribution, machine learning and quantum chemical methods [54,55]. This is not intended as a criticism of the estimation methods, but rather a statement that the methods are only as "good" as the datasets used in their development. Experimental-based solute descriptors need to be determined for molecules that have a greater chemical diversity and structural complexity.
To paraphrase the recent comments of Poole and Atapattu [23], the expansion of the Abraham model capabilities will stall without other researchers' participation in the determination of solute descriptors. The calculation of the solute descriptors for the 62 additional branched alkanes considered in the current communication is the first step in this endeavor.