IAM Chromatographic Models of Skin Permeation

The chromatographic retention factor log kIAM, obtained by IAM HPLC chromatography with buffered aqueous mobile phases, and calculated molecular descriptors (surface area, Sa; molar volume, VM; polar surface area, PSA; count of freely rotatable bonds, FRB; H-bond acceptor count, HA; energy of the highest occupied molecular orbital, EHOMO; energy of the lowest unoccupied molecular orbital, ELUMO; and polarizability, α) for a group of 160 structurally unrelated compounds were tested in order to generate useful models of the solutes' skin permeability coefficient log Kp. It was established that log kIAM obtained under the conditions described in this study is not sufficient as a sole predictor of the skin permeability coefficient. Simple, potentially useful models based on log kIAM and readily available calculated descriptors, accounting for 85 to 91% of the total variability, were generated using Multiple Linear Regression (MLR). The models proposed in the study were tested on a group of 20 compounds with known experimental log Kp values.


Introduction
Transepidermal absorption is an important route of chemicals' entry into the human body. The skin permeability coefficient Kp is defined according to Equation (1):

$$K_p = \frac{K_m D}{h} \tag{1}$$

where Km is the partition coefficient between the stratum corneum and the vehicle; D is the effective diffusion coefficient of the compound through the stratum corneum; and h is the diffusional pathlength.
The experimental values of skin permeability coefficients obtained in vivo (on human volunteers), ex vivo (on excised human skin), or even on animal models [1] are scarce and often inconsistent due to variations in the properties of different skin specimens; there are also some ethical considerations related to such models. For these reasons, several in vitro or in silico skin permeation models have been developed [2]. One of the most frequently cited in silico skin permeability models was proposed by Potts [3]. It is based on just two descriptors known to have a very strong influence on compounds' ability to cross biological barriers, namely lipophilicity (expressed as the octanol-water partition coefficient Kow) and molecular weight (Mw) (Equation (2)):

$$\log K_p = -2.80 + 0.66 \log K_{ow} - 0.0056\, M_w \tag{2}$$

The Potts model is widely acknowledged due to its simplicity [4], although it has been criticized in some studies because it gives erroneous results for compounds with extreme properties (very hydrophilic or lipophilic; non-hydrogen bonding or very strongly hydrogen-bonding) [5-8]. However, the predictions made using the Potts model are sufficiently good for the majority of drug-like compounds [9,10]; this model is therefore a popular tool in ADMET studies of drugs, and calculations based on it are offered by some popular ADMET prediction software packages [11,12]. Other proposed theoretical Kp models are based on descriptors such as the melting point, McGowan's characteristic volume, Abraham's solvation parameters or H-bonding properties (total H-bond count, H-bond acceptor count, and H-bond donor count), and total nitrogen and oxygen atom count [13-23]. QSAR studies of skin permeation published to date indicate that transdermal absorption is a complex property with several contributing factors [4,7,24-29].
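As a quick illustration of Equation (2), the following Python sketch evaluates the Potts model for a single compound. The function name `potts_log_kp` and the input values are assumptions made for this example only; they are not taken from the data set used in this study.

```python
def potts_log_kp(log_kow: float, mw: float) -> float:
    """Equation (2): log Kp = -2.80 + 0.66 log Kow - 0.0056 Mw."""
    return -2.80 + 0.66 * log_kow - 0.0056 * mw


# Hypothetical compound (illustrative numbers, not from Table 1):
# log Kow = 2.0 and Mw = 200.
log_kp = potts_log_kp(log_kow=2.0, mw=200.0)
print(f"log Kp = {log_kp:.2f}")
```

For these inputs the arithmetic is −2.80 + 1.32 − 1.12 = −2.60. The two slopes have opposite signs, so in this model higher lipophilicity raises the predicted log Kp while a larger molecular weight lowers it.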
Previous studies demonstrated the usefulness of chromatographic descriptors in skin permeability studies. Liquid chromatographic models of skin permeability are based on the notion that the solutes' partition between a stationary phase and a mobile phase resembles the partition between the skin and the vehicle. The separation techniques capable of providing chromatographic skin permeability predictors are liquid chromatography (HPLC or TLC), biopartitioning micellar chromatography, micellar electrokinetic chromatography, liposome electrokinetic chromatography, and two-dimensional gas chromatography (GCxGC) [8,30-40]. Chromatographic techniques of skin permeability studies are growing in popularity because of their high throughput, low cost, and good repeatability/reproducibility (the majority of such studies are conducted on commercially available stationary phases).
In our earlier studies, we proposed models of Kp based on calculated molecular parameters and RP-18 TLC-derived descriptors (RM or RM/VM) [23,41]. In these models, RM = log (1/Rf − 1) [42], with Rf values collected on the RP-18 stationary phase using acetonitrile/pH 7.4 phosphate-buffered saline 70:30 (v/v) as a mobile phase; log D is the distribution coefficient; PSA is the polar surface area (Å²); HD is the H-bond donor count; VM is the molar volume (Å³); ET is the total energy (kcal/mol); Eh is the hydration energy (kcal/mol); and (N + O) is the total oxygen and nitrogen atom count. We have also studied the skin permeability of organic sunscreens using RP-18 TLC-derived descriptors obtained with mobile phases containing different organic modifiers [43].
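The conversion from a TLC retardation factor to the RM descriptor defined above can be sketched as follows. The function name `rm_from_rf` is hypothetical, "log" is read as the decimal logarithm, and the Rf values are illustrative rather than measured.

```python
import math


def rm_from_rf(rf: float) -> float:
    """RM = log10(1/Rf - 1); defined only for 0 < Rf < 1."""
    if not 0.0 < rf < 1.0:
        raise ValueError("Rf must lie strictly between 0 and 1")
    return math.log10(1.0 / rf - 1.0)


# Illustrative Rf values (not experimental data from the study).
for rf in (0.2, 0.5, 0.8):
    print(rf, round(rm_from_rf(rf), 3))
```

A spot that travels half the solvent front distance (Rf = 0.5) gives RM = 0; more strongly retained solutes (smaller Rf) give positive RM values.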
Thanks to the ability of stationary phases based on phosphatidylcholine covalently linked to aminopropyl silica to mimic the natural membrane bilayer (or, to be precise, a half of it), Immobilized Artificial Membrane (IAM) chromatography performed on such sorbents has been used to predict physico-chemical and biological properties of solutes for many years [44]. The relationships between the IAM chromatographic retention factor (kIAM) and the skin permeability coefficient have been studied for small groups of compounds (n = 10 to 32), and the resulting dependencies are mostly univariate (linear or quadratic) [30,31,34,36]. One exception is the study in which McGowan's characteristic volume V was incorporated in a model alongside log kIAM (Equation (5)) [31]. Another is the model proposed by Barbato, based on a combination of an IAM chromatographic descriptor and the octanol-water partition coefficient log Kow [30]:

$$\log K_p = -2.136\,\Delta\log k_w^{IAM} + 0.037 \log K_{ow} - 2.373 \quad (n = 10,\ R^2 = 0.94) \tag{6}$$

where Δlog kwIAM is the difference between the measured log kwIAM and the value predicted on the basis of log Kow.
In this study, our intention was to investigate the potential of immobilized artificial membrane (IAM) chromatography in skin permeability studies of a large group of solutes from different chemical families. We hoped to provide simple and practical models, based on the IAM chromatographic retention factors and calculated physico-chemical descriptors, that could be used to predict the skin permeability coefficient of solutes by researchers in both the drug discovery and environmental toxicology fields.

Results and Discussion
Experimentally determined values of Kp were available for only some drugs within the studied group. For this reason, the models of skin permeability involving IAM chromatographic and calculated descriptors were generated and validated using Kp values obtained in silico with the EpiSuite software (DERMWIN v. 2 module; log Kp EPI), which is recommended by the US Environmental Protection Agency [12,45] and was tested on a sub-group of 20 solutes whose experimental log Kp values are known (log Kp exp) [29]. The estimation methodology used by DERMWIN is based on the above-mentioned Equation (2) [3]. The values of log Kp EPI obtained using DERMWIN are given in Table 1, where log kIAM denotes the IAM HPLC retention factors [46]; log Kp exp, the experimental values [29]; log Kp EPI, the values calculated using the DERMWIN software [12]; and log Kp (9) and log Kp (11) to log Kp (13), the values calculated according to Equations (9) and (11)-(13).
Compounds 1 to 160 were chromatographed on the IAM stationary phase using buffered aqueous mobile phases as described in Section 3. The retention factors (log kIAM) were compiled from the published literature by Sprunger et al. [46], whose main objective was to propose an IAM chromatographic retention model based on Abraham's solvation parameters. In our investigations, we focused mainly on drugs (that are or potentially could be administered transdermally) and environmentally relevant compounds (organic pollutants for which skin absorption is a possible route of exposure). The log kIAM values taken from Reference [46] were correlated with the log Kp EPI values presented in Table 1. Unfortunately, the resulting linear correlation (Equation (7)) is poor, with R² = 0.46.
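The univariate fit behind a correlation of this kind can be sketched in a few lines of plain Python. The helper name `linear_fit_r2` and the six data pairs are hypothetical, made up purely to show how the slope, intercept, and R² are obtained; they are not values from Table 1.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up (log kIAM, log Kp) pairs with deliberate scatter.
log_k_iam = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
log_kp_ref = [-3.1, -2.2, -2.9, -1.8, -2.4, -1.5]
slope, intercept, r2 = linear_fit_r2(log_k_iam, log_kp_ref)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.2f}")
```

An R² well below 1, as in the scattered example, means that a single predictor leaves much of the variance in the response unexplained, which is the motivation for adding further descriptors.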
Previous studies of the relationships between log Kp and log kIAM (e.g., Equation (8)) for small groups of compounds [31,34,36] failed to provide general chromatographic models applicable to molecules from different chemical classes. From this (and Equation (5) [31]), it was concluded that log kIAM obtained as described in Section 3 is not sufficient as a sole predictor of log Kp. At this point, it was decided to seek a multivariate linear relationship that would meet the following requirements: (i) give the best possible fit with the log Kp EPI reference values; (ii) fit the experimental log Kp exp values for a subgroup of compounds whose experimental skin permeability data are available (preferably from a single source to avoid possible discrepancies between experimental data collected by different protocols); and (iii) be as simple as possible and contain the minimum number of independent variables needed to generate models of reasonable predictive power without the risk of overfitting. These goals were achieved in our study by taking the following steps:
a. Generating a well-fitting model based on a relatively large number of independent variables selected by forward stepwise multiple regression;
b. Validating the model using two randomly selected subsets of compounds, namely a training set (n = 120) and a test set (n = 40);
c. Validating the initial model using experimental log Kp exp data for a subset of compounds (n = 20);
d. Analyzing every step of the multiple stepwise regression in order to eliminate redundant independent variables; and
e. Building a new model based on a reduced set of independent variables and validating it as described above.
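Steps a and d above can be sketched as a greedy forward selection that records R² after each added variable. This is a simplified stand-in, not the procedure implemented in the statistical packages used in the study: the stopping rule here is a minimum gain in R² (`min_gain`, an assumed threshold), whereas stepwise routines commonly rely on F-based entry criteria. The data are synthetic random numbers, and the descriptor labels are borrowed from the text only to make the output readable.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def forward_stepwise(X, y, names, min_gain=0.01):
    """Add, one at a time, the column that raises R^2 the most.

    Stops when the best remaining column improves R^2 by less than
    min_gain. Returns [(name, cumulative R^2), ...] in order of entry.
    """
    selected, history, best_r2 = [], [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        history.append((names[j], float(r2)))
        best_r2 = r2
    return history


# Synthetic data: 160 "compounds", 5 descriptors, 3 of which matter.
rng = np.random.default_rng(0)
names = ["log_kIAM", "alpha", "PSA", "FRB", "HA"]  # labels only
X = rng.normal(size=(160, 5))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(scale=0.1, size=160)

history = forward_stepwise(X, y, names)
for name, r2 in history:
    print(f"{name:10s} cumulative R2 = {r2:.3f}")
```

Reading the cumulative R² column step by step shows where additional variables stop contributing, which is the kind of inspection that motivates a reduced model.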
In the first step of the multiple regression analysis, the calculated physicochemical parameters presented in Table 2 were incorporated by forward stepwise multiple regression (Equation (9)). Model (9) was validated using the holdout method, in which data points are assigned to two sets, usually called the training set and the test set. The size of each set is arbitrary (the test set is usually smaller than the training set). The group of 160 studied compounds was divided into two subsets: a training set (compounds 1 to 120) and a test set (compounds 121 to 160). Equation (10) was generated for the training set using the same independent variables as Equation (9). The values of log Kp (10) were calculated for the test set according to Equation (10) and plotted against the reference log Kp EPI values to furnish a linear relationship (R² = 0.87). Model (9) was also tested on the subgroup of 20 compounds whose log Kp exp values were available (16 compounds belonging to the training set and four compounds belonging to the test set). The resulting relationship between log Kp (9) and log Kp exp is linear, with R² = 0.90.
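The holdout procedure described above can be sketched with NumPy as follows. Everything numerical here is synthetic: the descriptor matrix, the "reference" response, and the coefficients used to generate it are assumptions for illustration, and the helper names (`fit_ols`, `predict`, `r2_of_line`) are hypothetical. Only the 120/40 split by compound order follows the text.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to new descriptor rows."""
    return coef[0] + X @ coef[1:]


def r2_of_line(pred, ref):
    """R^2 of the straight line relating predicted and reference values
    (the squared Pearson correlation coefficient)."""
    return float(np.corrcoef(pred, ref)[0, 1] ** 2)


# Synthetic stand-in for 160 compounds with three descriptors.
rng = np.random.default_rng(1)
X = rng.normal(size=(160, 3))
y = -2.0 + 0.8 * X[:, 0] + 0.4 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(scale=0.3, size=160)

train, test = slice(0, 120), slice(120, 160)   # compounds 1-120 and 121-160
coef = fit_ols(X[train], y[train])             # model built on the training set only
r2_test = r2_of_line(predict(coef, X[test]), y[test])
print(f"test-set R2 = {r2_test:.2f}")
```

Because the test compounds play no part in fitting, a test-set R² close to the training-set value suggests that the model is not merely memorizing the training data.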
Equation (9), despite the encouraging validation results, was found unsatisfactory because it seems over-parameterized; it contains nine independent variables whose contributions, apart from those of log kIAM, α, and PSA, are negligible (log kIAM, α, and PSA account for over 91% of the total variability and the remaining six variables for less than 4%). With so many independent variables, it is also difficult to avoid collinearity. A decision was made to simplify Equation (9) as much as possible; apart from log kIAM, only two variables, PSA and α, seem to have sufficient influence on log Kp to justify incorporating them in Equations (11)-(13) (Figures 2-4).
The group of 160 studied compounds was again divided into two subsets: a training set (compounds 1 to 120) and a test set (compounds 121 to 160). Equations (14)-(16) were generated for the training set using the same independent variables as Equations (11)-(13). The values of log Kp (14), log Kp (15), and log Kp (16) were calculated for the test set according to Equations (14)-(16) and plotted against the reference log Kp EPI values to furnish linear relationships (R² = 0.86, 0.79, and 0.83, respectively). Models (11), (12), and (13) were also tested on the subgroup of 20 compounds whose log Kp exp values were available. The resulting relationships between log Kp (11), log Kp (12), and log Kp (13) and log Kp exp are linear, with R² = 0.88, 0.84, and 0.80, respectively.
Equation (13) involves two descriptors: log kIAM, which encodes some important properties responsible for drugs' absorption (lipophilicity and molecular size) but accounts for only 46% of the total log Kp variability, and PSA, which is known to influence other absorption phenomena (e.g., transport through the blood-brain barrier and uptake from the gastrointestinal tract) [47-49].
The coefficient for PSA in Equation (13) is negative, which suggests (as already reported, e.g., for blood-brain barrier passage or oral absorption) that compounds with large polar surface areas are not easily absorbed through the skin. However, from the statistical point of view, Equation (11) is superior to Equations (12) and (13), and gives results comparable to those obtained using Equation (9) without the risk of overfitting.

IAM Chromatography
The chromatographic retention factors (log kIAM) for the compounds analyzed in this study were compiled by Sprunger et al. [46]. They were obtained on an IAM.PC.DD2 HPLC column using an aqueous mobile phase buffered at pH ≤ 3 for carboxylic acids and in the pH range of 6.5 to 7.5 for other compounds.

Calculated Molecular Descriptors
The following molecular descriptors for the compounds investigated in this study were calculated with HyperChem 8.0 using the PM3 semi-empirical method with the Polak-Ribière algorithm: total dipole moment, DM (D); surface area (grid), Sa (Å²); molecular weight, Mw (g mol⁻¹); energy of the highest occupied molecular orbital, EHOMO (eV); and energy of the lowest unoccupied molecular orbital, ELUMO (eV). Other physicochemical parameters (octanol-water partition coefficient, log P; polar surface area, PSA (Å²); H-bond donor count, HD; H-bond acceptor count, HA; polarizability, α (cm³); molar volume, VM (cm³); and freely rotatable bond count, FRB) were calculated using ACD/Labs 8.0 software. (N+O), the total nitrogen and oxygen atom count, was calculated from the molecular structures. The relevant calculated molecular descriptors are given in Table 2. Statistical analysis was done using Statistica v.13 or StatistiXL v.2.

Conclusions
Multiple regression models for predicting the skin permeability coefficient of structurally diverse compounds were developed.
Due to the limited availability of experimental permeability data for the solutes investigated in this study, the reference skin permeability coefficients were calculated according to a widely accepted theoretical model proposed by Potts. The values of log Kp obtained using this model are in good agreement with the experimental data for drug-like compounds.
The newly developed log Kp models are based on a set of IAM chromatographic and computational descriptors. The main descriptors responsible for the variability of log Kp in the MLR equations are log kIAM, polarizability (α), and polar surface area (PSA); the MLR model gains very little in predictive power when other descriptors are incorporated. Linear relationships based on the IAM chromatographic retention factor and readily available calculated physico-chemical parameters give very good results and have the benefit of simplicity. The proposed models may be applied during the early steps of the drug discovery process, when the physico-chemical and biological properties of different drugs are often studied in vitro using IAM chromatography and rapid predictions are required. IAM chromatographic and computational studies of the skin permeability of compounds may therefore be of interest to pharmaceutical and medicinal chemists, and to researchers in the environmental sciences, since many compounds of environmental concern are absorbed through the skin.