Quantitative Retention (Structure)–Activity Relationships in Predicting the Pharmaceutical and Toxic Properties of Potential Pesticides

Micellar liquid chromatography and the quantitative retention (structure)–activity relationships method were used to predict the properties of carbamic and phenoxyacetic acid derivatives newly synthesized in our laboratory and considered as potential pesticides. Properties of the test substances that characterize their potential significance as pesticides as well as threats to humans were considered: the volume of distribution, the unbound fractions, the blood–brain distribution, the rates of skin and cell permeation, dermal absorption, binding to human serum albumin, partitioning between water and plant cuticles, and the lethal dose. Pharmacokinetic and toxicity parameters were predicted as functions of the solutes' lipophilicities and the numbers of hydrogen bond donors, hydrogen bond acceptors, and rotatable bonds. The derived equations were evaluated statistically and cross-validated. Important features of the molecular structure influencing the properties of the tested substances were indicated. The QSAR models that were developed had high predictive ability and high reliability in modeling the properties of the molecules that were tested. The investigations highlighted the applicability of combining chromatographic techniques with QR(S)ARs in modeling the important properties of potential pesticides and reducing unethical animal testing.


Introduction
Pesticides are very important substances in the modern world. They help to increase the efficiency of agricultural production and food processing by protecting crops against bacteria, fungi and molds, insects, rodents, and weeds. Since pesticides are used in the countryside, in forests, and in cities, people are constantly exposed to these substances in their diets [1][2][3]. Although scientists do not have a full understanding of the health effects of pesticide residues, there is no doubt that the use of these substances must be limited and controlled. As new pesticide-active compounds are developed, it is vitally important to be able to predict their properties, pharmacokinetics, and toxicities at the earliest stage of the research. Although modern science makes it possible to predict in silico the properties of substances solely on the basis of their molecular structure, the results of these calculations are rarely highly reliable, and they generally require experimental verification. To avoid highly unethical and costly animal testing, alternative techniques in combination with in silico modeling can be used to predict the properties of drug-like or pesticide-like compounds in screening [4,5].
Reversed-phase liquid chromatography (RPLC), both planar and column, is a technique commonly used to assess the lipophilic properties of bioactive organic substances [6][7][8]. Chromatography with stationary phases that imitate biological partitioning systems, such as an artificial membrane or phases with immobilized lipids, albumin, cholesterol, ceramides, or liposomes, allows the prediction of lipophilic properties [6][7][8][9][10] as well as the behavior of solutes in real biological systems, such as binding to serum albumin, skin permeation, blood-brain barrier permeability, intestinal absorption, the concentration of the unbound form in the blood, and others [11][12][13][14][15][16]. Similar possibilities are offered by micellar liquid chromatography (MLC), which uses surfactants as components of the mobile phase. MLC is a mode of conventional RPLC that uses a surfactant solution above the critical micellization concentration (cmc) in the mobile phase. Under these conditions, the micelles form the so-called micellar pseudophase in the bulk phase. The surrounding bulk water or aqueous-organic mixture contains surfactant monomers at a concentration approximately equal to the cmc. Surfactant monomers modify the surface phase as a result of the hydrophobic interactions between the tail of the surfactant and the alkyl chain. The molecular interactions present in this system, i.e., solute association with the polar head of the surfactant, solute penetration into the micelle core, and solute interactions with adsorbed surfactant and alkyl chains, affect retention through three different equilibria: (1) the solute distribution between the micelle (micellar pseudophase) and the bulk phase, (2) the solute partition between the stationary phase modified by the surfactant and the bulk phase, and (3) the direct transfer of solute molecules between the surfactant-modified surface and the micelle [17][18][19][20][21].
Several theories have been developed that describe retention in MLC, i.e., the effect of the concentration of the surfactant in the effluent on the retention of the solute. Foley's equation [22] is the best known in lipophilicity studies. According to Foley, the following relationship exists between the retention parameter, k, and the concentration of the surfactant in the effluent:
$$\frac{1}{k} = \frac{K_{AM}}{k_m}[M] + \frac{1}{k_m} \qquad (1)$$
where [M] is the total concentration of surfactant in the mobile phase minus cmc, KAM is the constant that describes solute-micelle binding, and km is the solute retention parameter at zero micellar concentration, i.e., at a surfactant monomer concentration equal to cmc. The KAM and km parameters can be evaluated from the slope and intercept of the experimental 1/k vs. [M] relationship. Equation (1) describes a linear dependence of 1/k on [M], with retention decreasing as the micelle concentration increases. This equation is valid for aqueous solutions of a surfactant or for mobile phases with the same concentration of the organic modifier. The micellar retention parameter, log km, is considered analogous to the log kw value evaluated in RPLC. Thus, this parameter is considered a lipophilicity descriptor, and Equation (1) is a simple way to determine the lipophilic properties of compounds indirectly. It is postulated that retention in micellar chromatography depends on the hydrophobic (lipophilic), electronic, and steric features of the compounds in a similar way to many pharmacokinetic phenomena. An additional similarity is indicated by the fact that the phospholipids, cholesterol, fatty acids, and triglycerides that are present in the extracellular and intracellular fluids also form micelles with proteins.
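The evaluation of km and KAM from the slope and intercept of the 1/k vs. [M] line can be sketched as follows. This is only a minimal illustration, assuming ordinary least squares via NumPy; the function name `foley_fit` and the concentration and retention values are hypothetical and are not data from this study.

```python
import numpy as np

def foley_fit(micelle_conc, k):
    """Fit 1/k = (K_AM/k_m)*[M] + 1/k_m and return (k_m, K_AM)."""
    m = np.asarray(micelle_conc, dtype=float)
    inv_k = 1.0 / np.asarray(k, dtype=float)
    # Straight-line fit of 1/k against [M]
    slope, intercept = np.polyfit(m, inv_k, 1)
    k_m = 1.0 / intercept   # intercept = 1/k_m
    k_am = slope * k_m      # slope = K_AM/k_m
    return k_m, k_am

# Hypothetical, noise-free retention factors generated from assumed
# values k_m = 50 and K_AM = 20 (not measured data).
M = np.array([0.02, 0.04, 0.06, 0.08, 0.10])
k_synthetic = 50.0 / (1.0 + 20.0 * M)

k_m, K_AM = foley_fit(M, k_synthetic)
print(f"k_m = {k_m:.2f}, K_AM = {K_AM:.2f}, log k_m = {np.log10(k_m):.3f}")
```

With real chromatographic data the points scatter around the line, so the quality of the linear fit should be checked before the parameters are used as lipophilicity descriptors.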
In these studies, 15 carbamic and phenoxyacetic acid derivatives (Table 1), newly synthesized in our laboratory and considered potential pesticides, were investigated using column micellar liquid chromatography. The km and KAM values, calculated from Equation (1) [23], were applied as lipophilicity descriptors of the solutes. Pharmacokinetic and toxicity parameters were predicted as functions of the solutes' lipophilicities (QRARs model) or of lipophilicity together with the number of hydrogen bond donors (HBD), the number of hydrogen bond acceptors (HBA), and the number of rotatable bonds (NRB) (QSARs model).

Results
The values of the retention parameters km and KAM are presented in Table 1. The relationship between the two parameters was checked, and the following rectilinear relationship was obtained:
$$\log k_m = 0.585\,(0.102) + 0.932\,(0.044)\,\log K_{AM} \qquad (2)$$
n = 15; s = 0.1612; R = 0.9861; Radj. = 0.9849; F = 456; p = 0.000000. This confirms that both micellar parameters, i.e., log km and log KAM, can be used as alternative descriptors of the lipophilicities of the compounds.
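The statistics quoted with Equation (2) (s, R, F) are the usual ones for a simple linear regression. A minimal sketch of how they can be computed is given here, assuming ordinary least squares with NumPy. The helper names and the five data points are hypothetical; only the coefficients 0.585 and 0.932 are taken from Equation (2).

```python
import numpy as np

def linear_fit_stats(x, y):
    """Simple linear regression y = a + b*x with s, R, and F statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    ss_res = float(np.sum(residuals ** 2))        # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    s = np.sqrt(ss_res / (n - 2))                 # standard error of estimate
    f = (ss_tot - ss_res) / (ss_res / (n - 2))    # F with (1, n-2) d.o.f.
    return {"intercept": intercept, "slope": slope,
            "R": np.sqrt(r2), "s": s, "F": f}

def log_km_from_log_kam(log_kam):
    """Equation (2) with the coefficients reported in the text."""
    return 0.585 + 0.932 * log_kam

# Hypothetical data, for illustration only (not the values of Table 1).
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [1.0, 3.1, 4.9, 7.2, 8.8]
stats = linear_fit_stats(x_demo, y_demo)
print(stats)
print(log_km_from_log_kam(1.0))
```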
The physicochemical, pharmacokinetic, and toxicity parameters of the compounds (Table 2) are as follows: the logarithm of the partition coefficient (log P) in the n-octanol/water system, the number of hydrogen bond donors (HBD), acceptors (HBA), and rotatable bonds (NRB), molecular weight (MW), topological polar surface area (TPSA) [24], the volume of distribution in the body (Vd) [25], the unbound fractions in the brain (fu, brain) and in plasma (fu, plasma) [26], and pharmacokinetic parameters describing blood-brain distribution (log BB) [26][27][28][29], the rate of permeation from aqueous solutions through skin (log Kp) [30,31], the skin-water partition coefficient (log Ksc) describing dermal absorption from aqueous solutions [32,33], the rate of permeation through cells (log Kw/cell) [34], partitioning between water and serum albumin (log Pw/HSA) and binding to human serum albumin (log KHSA) [10,35,36,37], partitioning between water and plant cuticles (log Pw/pc) [38], and the dose causing the death of 50% of the group of mice tested after oral administration (LD50) [39,40]. These parameters describe important properties of the test substances and provide information about their potential applications as pesticides as well as potential threats to humans [41,42].

Discussion
Based on the parameters predicted in silico (Table 2), it is important to note that the tested substances met the basic requirements formulated by Lipinski as the "rule of five" [43,44]: lipophilicities expressed as log P values are not greater than 5 (with the exception of substance no. 13); molecular weights are not greater than 500 g/mol; the numbers of hydrogen bond acceptors are not greater than 10; and the numbers of hydrogen bond donors are not greater than 5. In addition, the topological polar surface areas are below 90 Å², and the number of rotatable bonds is in the range of 2-5. The compounds have moderate in silico predicted Vd values (Vd < 7 L/kg), indicating that they do not accumulate to a significant extent in fat tissue. The highest values of Vd were observed for compounds no. 12 and no. 13, the most lipophilic of all the tested compounds. The values of log KHSA, which describe binding to human serum albumin, were in the range of 3.98-5.53, whereas the log Pw/HSA parameters, which characterize solute partitioning between water and serum albumin, were in the range of 0.200-1.823. Xenobiotics bound to plasma proteins are not active because they can neither cross membranes to reach the site of action nor bind to receptors. Binding to serum albumin affects the concentration of the unbound form of the substance in serum, and small free fractions are preferable in order to prevent possible side effects.
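The rule-of-five screening described above amounts to four threshold checks. The sketch below shows one way to express them; the `Descriptors` class, the function name, and the example values are hypothetical and do not correspond to any compound in Table 2.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """In silico descriptors needed for the rule-of-five check."""
    log_p: float   # n-octanol/water partition coefficient
    mw: float      # molecular weight, g/mol
    hbd: int       # number of hydrogen bond donors
    hba: int       # number of hydrogen bond acceptors

def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the list of violated rule-of-five criteria (empty if none)."""
    violations = []
    if d.log_p > 5:
        violations.append("log P > 5")
    if d.mw > 500:
        violations.append("MW > 500")
    if d.hbd > 5:
        violations.append("HBD > 5")
    if d.hba > 10:
        violations.append("HBA > 10")
    return violations

# Hypothetical compound with illustrative values (not taken from Table 2).
example = Descriptors(log_p=5.4, mw=350.0, hbd=1, hba=3)
print(lipinski_violations(example))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easy to report cases such as a compound that exceeds only the log P limit.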
Negative values of log BB, or values close to zero (in the range of −0.513 to −0.275), suggest that the corresponding compounds will not readily penetrate into the brain, so neurotoxicity will be diminished; low levels of penetration into the brain are desirable. Compounds no. 12 and 13 had positive values of log BB, and they had the lowest unbound fractions in the brain (fu, brain). These two compounds, which had the highest log KHSA and log Pw/HSA values, also had the lowest unbound fractions in plasma (fu, plasma). According to the parameters calculated in silico, these substances are characterized by the highest rate of permeation through the skin (log Kp), dermal absorption (log Ksc), and partitioning between water and plant cuticle (log Pw/pc), as well as the lowest rate of permeation through cells (log Kw/cell). Substance no. 14 is the most toxic and is poorly bound to albumin; its concentrations in the unbound form in the brain and serum were the highest among the tested substances, even though its value of log BB was negative.
When different parameters (e.g., chromatographic retention) are considered as lipophilicity descriptors, they should be compared with the log P values that describe solute partitioning between n-octanol and water. In our studies, the relationships between the chromatographic parameters and the partition coefficient log P are presented in Figure 1. In both cases, i.e., for log km and log KAM, separate relationships for group I (carbamic acid derivatives) and group II (phenoxyacetic acid derivatives) were obtained with very good linearity (R >> 0.8). They confirm that both micellar parameters are suitable lipophilicity descriptors of the tested compounds.

QRARs
In the QRARs methodology, relationships are sought between solute retention and the biological activity of the compounds. In our investigations, we obtained relationships of very good quality (R >> 0.8) between the micellar parameters (log km and log KAM) and the other parameters, i.e., log Ksc and log Pw/HSA (Figure 2) and Vd, log Pw/pc, fu, brain, and fu, plasma (Figure 3). The straight lines in Figure 2A,B show a clear increase in dermal absorption and in partitioning in the water-human serum albumin system with increasing lipophilicity of the tested compounds. It should be noted that the lipophilicities of the tested substances, based on the in silico log P parameters, are in the range of 1.345-5.386. Lipophilicity affects the volume of distribution (Vd) and absorption by the plant cuticle (log Pw/pc), as well as the unbound fractions in the brain (fu, brain) and plasma (fu, plasma), according to a parabolic function (Figure 3). Whereas Vd and log Pw/pc increase with lipophilicity, fu, plasma and fu, brain decrease. The graphs suggest the existence of an optimal range of lipophilicity for which the volume of distribution and the absorption through the epidermis are the highest and the unbound fractions in plasma and in the brain are the lowest. Figures 2 and 3 also indicate that lipophilicity is the dominant factor that influences (1) the absorption of the test substances through the skin and epidermis, (2) the water-albumin distribution, (3) the size of the unbound fraction in the plasma and the brain, and (4) the volume of distribution.
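A parabolic dependence on lipophilicity, and the position of its optimum, can be illustrated with a second-degree polynomial fit. The text does not report the fitted parabolic equations, so the sketch below uses purely synthetic values; the function name `fit_parabola` and all numbers are assumptions made for illustration.

```python
import numpy as np

def fit_parabola(x, y):
    """Fit y = a*x^2 + b*x + c and return ((a, b, c), x at the vertex)."""
    a, b, c = np.polyfit(np.asarray(x, dtype=float),
                         np.asarray(y, dtype=float), 2)
    x_vertex = -b / (2.0 * a)   # extremum of the fitted parabola
    return (a, b, c), x_vertex

# Synthetic example: a response with a maximum at a lipophilicity
# descriptor value of 3.0 (hypothetical values, not from Figure 3).
log_km_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
response_demo = -(log_km_demo - 3.0) ** 2 + 5.0

coeffs, x_opt = fit_parabola(log_km_demo, response_demo)
print(coeffs, x_opt)
```

The vertex is meaningful as an optimum only when it falls inside the range of lipophilicities actually covered by the data; outside that range it is an extrapolation.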

QSARs
In the QSARs methodology, we used the experimentally derived lipophilicities (micellar parameters log km and log KAM) and the numbers of hydrogen bond donors (HBD), acceptors (HBA), and rotatable bonds (NRB) as independent variables. These values were used to predict the dependent variables, i.e., log Kp, log Kw/cell, log KHSA, log BB, and LD50. Table 3 shows the quantitative structure-activity relationships (expressed as Equations (3)-(12)) that were established. The equations were cross-validated (LOO), and all of the calculated statistics are summarized in Table 3 and presented graphically in Figures 4-8 as PLS-standardized coefficients (A), the response plots (B), and standardized residuals vs. leverages (C).
Table 3. The established quantitative structure-activity relationships: n, number of observations; R, coefficient of determination; Radj., adjusted coefficient of determination; sd, standard deviation; F-value; p, probability value; VIF, variance inflation factor; PRESS, predicted residual sum of squares; MSE, mean square error; cv, cross-validated.
The statistical parameters allowed us to evaluate the derived QSAR equations positively. There were no significant cross-correlations between the independent variables, and the values of the variance inflation factor (VIF) were significantly lower than 5. The diagrams presented in Figures 4A, 5A, 6A, 7A and 8A show the standardized coefficients of Equations (3)-(12), and they explain the direction and the strength of the impact of a given descriptor on the calculated parameters. The correlations shown in Figures 4B, 5B, 6B, 7B and 8B illustrate the relationships between the actual response (values obtained from ACD/Percepta software) and the response predicted by the established QSAR models (calculated response). The applicability domains (AD) of the developed regression models were also evaluated and are visualized as Williams plots (Figures 4C, 5C, 6C, 7C and 8C). The AD is a theoretical region in physicochemical space (the response and chemical structure space) for which a model should make predictions with a given reliability [45]. The warning leverage limits (h* = 1.0) were calculated using the following equation:
$$h^{*} = \frac{3(k+1)}{n}$$
where k is the number of descriptors used in the MLR model, and n is the number of compounds in the data set. The Williams plot can be used for graphical detection of outliers (h > h*).
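The leverage check behind a Williams plot can be sketched as follows. The sketch assumes that leverages are the diagonal elements of the hat matrix of a regression with an intercept and that the warning limit has the common form h* = 3(k + 1)/n, which gives the stated value of 1.0 for k = 4 descriptors and n = 15 compounds. The descriptor matrix is random and purely hypothetical, and the function names are illustrative.

```python
import numpy as np

def leverages(descriptors):
    """Diagonal of the hat matrix H = A (A^T A)^-1 A^T, A = [1, X]."""
    x = np.asarray(descriptors, dtype=float)
    n = x.shape[0]
    a = np.column_stack([np.ones(n), x])   # add the intercept column
    hat = a @ np.linalg.pinv(a.T @ a) @ a.T
    return np.diag(hat)

def warning_leverage(k, n):
    """Warning leverage limit h* = 3(k + 1)/n (assumed form)."""
    return 3.0 * (k + 1) / n

# Hypothetical descriptor matrix: 15 compounds x 4 descriptors
# (e.g. log km, HBD, HBA, NRB); random numbers, not data from Table 2.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(15, 4))

h = leverages(X_demo)
h_star = warning_leverage(k=X_demo.shape[1], n=X_demo.shape[0])
flagged = np.where(h > h_star)[0]   # indices of high-leverage compounds
print(h_star, h.round(3), flagged)
```

Plotting the standardized residuals of a model against `h`, with a vertical line at `h_star`, gives the Williams plot; compounds to the right of the line lie outside the applicability domain in the descriptor space.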
The results proved that the models obtained are valid within the domain in which they were developed. The results obtained in our studies indicate a positive effect of solute lipophilicity on skin permeation (log Kp) (Figures 4 and 5) and cell permeation (log Kw/cell) (Figures 6 and 7) from water, binding affinity to human serum albumin (log KHSA) (Figures 8 and 9), concentration in the brain (log BB) (Figures 10 and 11), and toxicity in mice (a decrease in LD50) (Figures 12 and 13). Lipophilicity is the dominant factor for log Kp, log KHSA, log BB, and LD50. The rates of cell permeation are strongly retarded by solute hydrogen bond acidity and rather less so by hydrogen bond basicity (Figures 6 and 7). The same effects of the compounds' acidity and basicity on skin permeation were observed (Figures 4 and 5). The number of hydrogen bond donors (HBD) also strongly reduces permeation through the blood-brain barrier (Figures 10 and 11) and increases the value of the lethal dose (Figures 12 and 13). With an increasing number of hydrogen bond acceptors, the values of LD50 decrease, i.e., the toxicity of the solutes increases. Binding to human serum albumin is strongly decreased by hydrogen bond basicity (HBA) and much less strongly increased by hydrogen bond acidity (HBD) (Figures 8 and 9). Solute flexibility, as described by the NRB values, strongly increases the rate of dermal absorption (Figures 4 and 5) and binding to human serum albumin (Figures 8 and 9). It also reduces the LD50 value, i.e., increases the toxicity of the substance (Figures 12 and 13). NRB has a slightly positive effect on cell permeation (Figures 6 and 7). Hydrogen bond basicity and solute flexibility have practically no effect on the penetration of substances through the blood-brain barrier (Figures 10 and 11).
When analyzing the results, substances no. 10-15 (phenoxyacetic acid derivatives) should be indicated as the most toxic for mice, i.e., having the lowest lethal doses after oral administration. These substances are the more lipophilic among those tested (log P values in the range of 2.6-5.386), with smaller HBD (HBD ≤ 1) and greater HBA (HBA ≥ 3) values, and they have the greatest numbers of rotatable bonds (NRB > 3). They also reach higher concentrations in the brain; with the exception of compound no. 14, all of their log BB values were greater than 0.
Summarizing the results, substances no. 12 and no. 13 can be indicated as the most interesting among those that were tested. They are the most toxic, but they are also highly bound to plasma albumin, and their free fractions in plasma and the brain are the lowest. The magnitudes of the distribution are acceptable, as they were for all of the substances that were tested. On the basis of the results that were obtained, it can be concluded that they can be considered promising pesticides as well as subjects for further, more detailed research.

In Silico Parameters
The physicochemical, structural, pharmacokinetic, and toxicity parameters of the compounds that were tested were calculated from their molecular structures using ACD/Percepta software, version 1994-2012 (ACD/Labs, Advanced Chemistry Development, Inc., Toronto, ON, Canada) (Table 2).

Statistics
Linear regression (LR), multiple linear regression (MLR), partial least squares (PLS), and leave-one-out cross-validation (LOO) were conducted using the statistical software

Conclusions
The QRAR and QSAR methodologies were successful in modeling the pharmacokinetic properties and toxicities of 15 newly synthesized compounds considered as potential pesticides. Micellar liquid chromatography was used to determine the lipophilicity descriptors (log km and log KAM) of the compounds. In the QSAR method, the log km and log KAM parameters, HBD, HBA, and NRB were applied as independent variables. All of the derived equations were evaluated statistically as being very good. The QSAR models that were developed had high predictive ability and high reliability in modeling the properties of the molecules that were tested. The investigations highlighted the significance and possibilities of combining chromatographic techniques with QR(S)ARs in modeling the important properties of potential pesticides and reducing unethical animal testing.