Predicting Pharmacokinetic Properties of Potential Anticancer Agents via Their Chromatographic Behavior on Different Reversed Phase Materials

The Quantitative Structure-Activity Relationship (QSAR) methodology was used to predict biological properties, i.e., the blood–brain distribution (log BB), unbound fraction in the brain (fu,brain), water-skin permeation (log Kp), binding to human plasma proteins (log Ka,HSA), and intestinal permeability (Caco-2), for three classes of fused azaisocytosine-containing congeners that were considered and tested as promising drug candidates. The compounds were characterized by lipophilic, structural, and electronic descriptors, i.e., chromatographic retention, topological polar surface area, polarizability, and molecular weight. Different reversed-phase liquid chromatography techniques were used to determine the chromatographic lipophilicity of the tested compounds: micellar liquid chromatography (MLC) with an ODS-2 column and polyoxyethylene lauryl ether (Brij 35) as the mobile phase component, immobilized artificial membrane (IAM) chromatography with a phosphatidylcholine column (IAM.PC.DD2), and chromatography with an end-capped octadecylsilyl (ODS) column using aqueous solutions of acetonitrile as the mobile phases. Using multiple linear regression, we derived statistically significant quantitative structure-activity relationships. All these QSAR equations were validated and found to be very good. The investigations highlight the significance and possibilities of liquid chromatographic techniques with three different reversed-phase materials and QSAR methods in predicting the pharmacokinetic properties of our important organic compounds and reducing unethical animal testing.


Introduction
The use of various chromatographic techniques in supporting the drug discovery process and in physicochemical research has become quite extensive. The search for new biologically active substances, considered as potential drugs or plant protection products, is an important task in modern science. The goals are to improve people's quality of life and their life expectancy and to increase agricultural productivity while ensuring diversity and protecting the environment. One way to achieve the above goal is to synthesize new compounds that have the desired properties. Since the 19th century, it has been known that the properties of chemical substances are closely related to their molecular structures. The intensive development of the Quantitative Structure-Activity Relationships (QSARs) method began in the 1960s and continues today [1][2][3]. In this method, searches are conducted to identify the multidimensional relationships that exist between the biological properties and structural parameters for a group of congeneric compounds. The derived mathematical model can be extended to new compounds with similar structures and used to predict their biological properties. In this way, it is possible to design new molecules that have the desired properties. The model becomes the basis for making decisions concerning the synthesis of new compounds, which allows researchers to limit the time and cost associated with their research. In addition, the interpretation of a mathematical model can lead to an overall model of a given biological property, which provides information that can be used to obtain the optimal design of desired chemical substances.
The relationship between solute activity and the parameters describing its molecular properties can be expressed as a multiple linear regression (MLR) [2][3][4]:

$$\text{Activity} = aA + bB + cC + \ldots + \text{const} = f(\text{lipophilic, electronic, steric properties}) \qquad (1)$$

where a, b, c, and so on are the regression coefficients. The molecular descriptors (A, B, C, ...) relating to the lipophilic, electronic, and steric properties of the molecule can be determined experimentally or evaluated in silico. Currently, there are many software products on the market that allow such calculations, e.g., HyperChem, ACD/ChemSketch, ACD/LADME, and SciLogP.
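As an illustration of Equation (1), the following minimal sketch fits a multiple linear regression by ordinary least squares in plain Python. The function names (`fit_mlr`, `predict`), the two descriptors, and all numerical values are hypothetical and synthetic; they are not data or software from this study.

```python
def fit_mlr(X, y):
    """Ordinary least squares for y = const + a*A + b*B + ... (normal equations)."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented normal-equation system (A^T A | A^T y)
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    # Matrix is now diagonal: coefficients are [const, a, b, ...]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coefs, x):
    """Evaluate the fitted model for one descriptor vector x."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))


# Synthetic example: two descriptors (a retention parameter and a TPSA-like value)
X = [[1.2, 60.0], [1.8, 75.0], [2.1, 55.0], [2.6, 90.0], [3.0, 70.0]]
y = [0.5 + 0.8 * a - 0.02 * b for a, b in X]  # noise-free synthetic response
coefs = fit_mlr(X, y)
print([round(c, 4) for c in coefs])
```

In practice a statistics package would also report standard errors, cross-validated statistics, and variance inflation factors; the sketch only shows how the coefficients of Equation (1) arise from the descriptor matrix.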
The lipophilicity of a bioactive compound is usually expressed by the logarithm of the partition coefficient in an n-octanol/water system and is either measured experimentally by the "shake-flask" method (log P o/w ) or evaluated in silico from the molecular structure using different algorithms (fragment, atomic, molecular, or combined atomic-fragment). Retention parameters, especially log k w values, measured by column or planar Reversed-Phase Liquid Chromatography (RPLC), are the most popular, and they are accepted as chromatographic lipophilicity descriptors by the Organisation for Economic Co-operation and Development (OECD). In addition, liquid chromatography techniques are very popular as indirect in vitro methods for the determination of lipophilicity [5][6][7]. The chromatographic methods used to assess lipophilicity have significant advantages, e.g., simplicity and reduced experimental time, good reproducibility, process automation, no need for quantitative analysis, and small amounts of sample required. The measurements are also unaffected by low solubility of the compound and by the presence of impurities or degradation products. However, some limitations of the RPLC method have also been noted. The most important of these limitations are: (1) insufficient modeling of the n-octanol-water system for structurally diverse compounds, (2) the effect of the pore size and possible interactions with the residual silanol groups on the surface that do not occur in the n-octanol-water partitioning system, (3) time-consuming isocratic measurements, and (4) a limited working range of pH. Some solutions have been developed in the last few decades to overcome these limitations.
First, it is worth mentioning the novel types of columns that were designed to mimic the n-octanol-water system (e.g., polymeric reversed-phase columns (PLRPs) or polymer-based columns) or to mimic biological partitioning, e.g., immobilized artificial membranes (IAMs) or columns with immobilized cholesterol, human or rat serum albumins, glycoproteins, and others [8][9][10]. Modifying the mobile phase is another solution. Such possibilities are offered by Micellar Liquid Chromatography (MLC), which uses surfactants as components of the mobile phase [5][11][12][13], and by the use of ionic liquids as eluents [14]. A specific type of micellar chromatography uses Brij 35 and, if necessary, a low concentration of alcohol in the mobile phase. The hydrophilic/hydrophobic nature of the surfactant-modified stationary phase structurally resembles the ordered array of hydrocarbon chains in biomembranes. In addition, the surfactant and alcohol present in the mobile phase provide the opportunity for hydrogen bonds to form. Because of its similarity to biological barriers and extracellular fluids, this technique is usually referred to as Biopartitioning Micellar Chromatography (BMC) [15,16]. An important advantage of MLC is that it meets the recommendations of green chemistry by limiting the consumption of organic solvents [17].
The most important biological properties of substances considered as potential drugs are their ability to bind blood proteins (albumin and alpha-1-acid glycoprotein); penetrate biological barriers, e.g., the blood-brain barrier (BBB); permeate the skin; and undergo intestinal absorption. When entering plasma, most compounds bind rapidly to the constituents of blood, but it is the concentration of the free drug that determines pharmacological activity, safety, and tissue distribution. So, the extent of protein binding in plasma affects the pharmacokinetic characteristics of a compound, i.e., its clearance, volume of distribution, half-life, drug-drug interactions, and pharmacological efficacy. Agents intended to interact with the central nervous system must be able to cross the BBB, and satisfactory transport through the blood-brain barrier is an essential prerequisite for a potential drug to affect the central nervous system. However, in order to avoid side effects, agents that act peripherally should not cross the BBB. In both cases, the permeability of the BBB must be known and should be evaluated at the earliest possible stage of testing. Intestinal absorption is particularly important for oral medications that are transported into the blood via the intestinal tract [11,18]. Conventionally, determining these descriptors requires in vivo testing (e.g., in rats, dogs, monkeys, or humans). In vivo tests are extremely unethical and inhumane. They also require financial outlays and time that are disproportionate to the results achieved. Over the past few decades, along with the rapid development of computational methods, combinatorial chemistry, and high-throughput biological research, it has become possible to accelerate the selection of "ideal" drug candidates for further development. If the structure of a compound is known, then it is possible to predict its lipophilic, biological, and physicochemical properties.
However, in silico methods do not provide reliable results for substances with more complex structures. Compared with conventional methods, chromatography using biomimetic systems, recognized as an in vitro technique, is becoming increasingly popular.
The tested compounds 1-19 belong to three anticancer active classes of structurally related small molecules (Table 1) that share the same privileged heterocyclic scaffold [19][20][21]. Moreover, two classes of compounds possess isosteric groups such as the isopropyl in 1-6 and trifluoromethyl in 7-14. Two novel sets of fused azaisocytosine-containing congeners 1-6 and 7-14 are antimetabolites that possess the elucidated mechanism of their antiproliferative action (by caspase activation). They were synthesized in our laboratory and patented. These azaisocytosine-containing congeners were described in our earlier paper in which their medical anticancer utility was also mentioned [19]. Most of the title molecules exhibited more potent cytotoxicity in human cancer cells than a clinically approved anticancer agent, pemetrexed, and also revealed similar or more protective effects than that of ascorbic acid and Trolox in an ex vivo model of rat erythrocytes exposed to oxidants. The compounds 8, 10, 11, 17, and 19 exhibited clearly higher antiproliferative effects in cancer cells than in normal cells [19][20][21]. In addition, the compounds 7, 8, and 10-12 were shown to be safer for the early life stages of Danio rerio than pemetrexed [22]. Due to their important pharmacological activity, all these organic substances require more thorough and extensive research on modeling their pharmacokinetic properties.

Table 1. The compounds tested and their structure, molecular weight (MW), topological polar surface area (TPSA), polarizability (α), pharmacokinetic parameters (log K p , log K a,HSA , log BB, Caco-2, f u,brain ), and lipophilicity (log P).
In our present research, we used the following protocol: (1) the in vitro determination of chromatography-based lipophilicity parameters of the tested compounds using reversed-phase materials capable of imitating pharmacokinetic and partitioning processes in biological systems and an end-capped ODS column; (2) the in silico calculation of structural and electronic descriptors (topological polar surface area, polarizability, and molecular weight); (3) the in silico calculation of partition coefficients (log P) and pharmacokinetic properties (e.g., log BB, f u,brain , log K p , log K a,HSA , and Caco-2) affecting drug-like properties of the tested compounds from molecular structures using the ACD/Percepta software; (4) the establishment and validation of new QSAR models, which make it possible to predict the pharmacokinetic properties (such as log BB, f u,brain , log K p , log K a,HSA , and Caco-2) of the title compounds on the basis of their experimental lipophilicity parameters and structural and electronic descriptors; and (5) the visualization of correlations between the dependent solute properties obtained from newly constructed QSAR models and those established in silico.

Chromatographic Data
There are several theories that describe the effect of the concentration of the surfactant in the mobile phase on the retention of the solute in MLC [23]. Foley's equation is the best known in lipophilicity studies [24]:

$$\frac{1}{k} = \frac{K_{AM}}{k_m}\,[M] + \frac{1}{k_m} \qquad (2)$$

where k is the retention factor of the solute; [M] is the total concentration of the surfactant in the mobile phase minus CMC; K AM is the constant that describes solute-micelle binding; and k m is the solute retention parameter at the micellar concentration of zero, i.e., when the concentration of the surfactant monomer is equal to CMC. The K AM and k m parameters can be evaluated from the slope and intercept of the experimental 1/k vs. [M] relationships. Equation (2) describes a linear dependence of 1/k on micelle concentration, i.e., retention decreases as the micelle concentration increases. This equation is valid for aqueous solutions of surfactant or mobile phases with the same concentrations of the organic modifier. The micellar retention parameter, log k m , is considered analogous to the log k w parameter evaluated in RPLC. According to the information presented above, this parameter is considered to be a lipophilicity descriptor, and Equation (2) is a simple way to indirectly determine the lipophilic properties of compounds. It is postulated that retention in micellar chromatography depends on hydrophobic (lipophilic), electronic, and steric features of the compounds in a similar way as has been noted concerning pharmacokinetic phenomena. The additional similarity results from the fact that the phospholipids, cholesterol, fatty acids, and triglycerides included in the extracellular and intracellular fluids also form micelles with proteins.
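The evaluation of k m and K AM from the slope and intercept can be sketched as follows, assuming Foley's equation in its standard form 1/k = 1/k_m + (K_AM/k_m)[M]. The function names, the CMC value, and the retention data are hypothetical; the data are generated synthetically from chosen parameters and are not measurements from this study. Only the four surfactant concentrations are taken from the text.

```python
import math


def linear_fit(x, y):
    """Least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def foley_parameters(surfactant_conc, k_values, cmc=0.0):
    """Return (k_m, K_AM, log k_m) from retention factors measured at
    several total surfactant concentrations (mol/L)."""
    micelle_conc = [c - cmc for c in surfactant_conc]  # [M] = total - CMC
    inv_k = [1.0 / k for k in k_values]
    slope, intercept = linear_fit(micelle_conc, inv_k)
    k_m = 1.0 / intercept        # intercept = 1/k_m
    K_AM = slope / intercept     # slope = K_AM/k_m
    return k_m, K_AM, math.log10(k_m)


# Synthetic check: generate k from assumed parameters, then recover them.
conc = [0.075, 0.10, 0.125, 0.15]  # Brij 35 concentrations used in the text
cmc = 1e-4                         # placeholder CMC, not a value from the paper
true_k_m, true_K_AM = 50.0, 20.0   # arbitrary illustrative parameters
k_obs = [true_k_m / (1.0 + true_K_AM * (c - cmc)) for c in conc]
k_m, K_AM, log_k_m = foley_parameters(conc, k_obs, cmc=cmc)
print(round(k_m, 3), round(K_AM, 3), round(log_k_m, 3))
```

With real data, the linearity of the 1/k vs. [M] plot should be checked before the extrapolated k m value is interpreted as a lipophilicity descriptor.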
In our investigations, the micellar retention factors were used to calculate the log k m parameters using Equation (2) (Table 2). For all of the compounds 1-19, the relationships of 1/k vs. [M] were obtained with very good linearity, confirming that Foley's equation correctly describes the effect of the concentration of the surfactant in the mobile phase on the retention of the solute. In our studies, the micellar log k m parameters for three pharmacologically active sets of compounds 1-6, 7-14, and 15-19 and the log k factors for solutes 15-19 obtained on IAM and end-capped ODS stationary phases were determined experimentally. All these retention factors, together with the log k w values for compounds 1-14 obtained in our earlier investigations on IAM and end-capped ODS stationary phases [19], were used as lipophilicity descriptors in the QSAR methodology to predict the pharmacokinetic descriptors of the tested compounds.

Establishment of Quantitative Structure-Activity Relationships
In the QSAR methodology, we used the experimentally derived lipophilicity (chromatographic parameters: log k m , log k w,IAM , and log k w,ODS ), structural (topological polar surface area TPSA, molecular weight MW), and electronic (polarizability α) descriptors as independent variables. These values were used to predict different pharmacokinetic parameters (dependent variables) evaluated for tested compounds (Table 2). Table 3 shows the quantitative structure-activity relationships (expressed as Equations (3)-(22)) that were established. The relationships were validated, and the results are presented in Table 4. The statistical parameters allowed us to evaluate the derived QSAR equations as being very good. There were no significant cross-correlations between the parameters that characterized the substances, i.e., the values of the variance inflation factor (VIF) were well below 5. The diagrams presented in Figure 1A show the standard coefficients of Equations (3)-(5) established for the log K p as an example. The remaining diagrams are presented in Figures S1A-S4A in the Supplementary Material. They explain both the direction and strength of the impact of a given structural descriptor on the calculated biological parameter. The correlations shown in Figure 1B illustrate the relationships between the log K p values calculated with the ACD/Percepta software (actual response) and those predicted by the QSAR models (calculated response) that were developed (Equations (3)-(5)). The remaining diagrams are presented in Figures S1B-S4B in the Supplementary Material. The applicability domain (AD) of the developed regression models was also evaluated and visualized as the Williams plots (Figure 1C and Figures S1C-S4C in the Supplementary Material). AD is a theoretical region in physicochemical space (the response and chemical structure space) for which a QSAR model should make predictions with a given reliability.
The warning leverage limits (h* = 0.789 and 0.632) were calculated using the following equation:

$$h^{*} = \frac{3(k + 1)}{n}$$

where k is the number of descriptors used in the MLR model and n is the number of compounds in the dataset. The Williams plot can be used for graphical detection of outliers (h > h*) [25]. The results proved that the obtained models are valid within the domain in which they were developed.
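As a quick arithmetic check of the warning leverage, the sketch below assumes the commonly used form h* = 3(k + 1)/n. With n = 19 compounds, the two reported limits are reproduced by k = 4 and k = 3 descriptors; these k values are inferred from the numbers and are not stated explicitly in the text. The function names and the leverage values in the usage example are hypothetical.

```python
def warning_leverage(k, n):
    """Warning leverage h* for an MLR model with k descriptors and n compounds."""
    return 3 * (k + 1) / n


def outside_domain(leverages, k, n):
    """Indices of compounds whose leverage h exceeds h* (candidates for
    lying outside the applicability domain on a Williams plot)."""
    h_star = warning_leverage(k, n)
    return [i for i, h in enumerate(leverages) if h > h_star]


n_compounds = 19  # compounds 1-19
print(round(warning_leverage(4, n_compounds), 3))  # four-descriptor model
print(round(warning_leverage(3, n_compounds), 3))  # three-descriptor model

# Hypothetical leverage values, for illustration only
flagged = outside_domain([0.21, 0.85, 0.40], k=4, n=n_compounds)
print(flagged)
```

The leverages themselves are the diagonal elements of the hat matrix of the descriptor matrix; computing them is left to a statistics package in this sketch.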

Assessing the Risk of Undesired Effects
Many potential pharmaceuticals cannot be subjected to clinical trials on humans due to the risk of serious adverse effects. Hence, in silico tools such as the OSIRIS Property Explorer (http://www.organic-chemistry.org/prog/peo/ accessed on 25 March 2021), recommended by the Food and Drug Administration, are helpful in the qualitative prediction of serious adverse side effects in the early stage of the drug development process. For all the investigated compounds 1-19 considered as potential anticancer agents, no risk of mutagenicity, tumorigenicity, or irritating effects was predicted. In addition, no risk (for compounds 1-14) or medium risk (for compounds 15-19) of reproductive effects was found. This is as expected due to the lack of "genotoxicophore" fragments in the tested molecules. The results are shown in Table S1 in the Supplementary Material.

Discussion
The lipophilic properties of compounds increase their binding to human serum albumin and to the lipids contained in biological membranes. In the investigations, the lipophilic properties of our compounds were described based on their chromatographic retention (log k m , log k w,ODS , and log k w,IAM ), and the positive effect of these parameters on the log K p , log K a,HSA , log BB, Caco-2, and f u,brain values was obtained (Figure 1, Figures S1-S4 in the Supplementary Material). Taking into account the standardized coefficients, the lipophilicity had a similar, moderate impact on the above parameters. The same effect was found for α, i.e., the polarizability of the molecules. This parameter increased the strength of the van der Waals interactions between the solutes and the albumin or lipid molecules [26]. Thus, the polarizability of the molecule increased the values of log K p , log K a,HSA , and log BB. Polarizability increased the values of log K p and log K a,HSA similarly to or slightly more than lipophilicity. In the case of log BB, polarizability seemed to be the dominant positive factor. We observed no effect of polarizability on the values of Caco-2 and f u,brain (Equations (12)-(17)). The positive effect of molecular weight (MW) on the values of log K p , log K a,HSA , and Caco-2 could be explained by the partition mechanism of the permeation of the tested substances through biological membranes as well as human serum albumin. Similarly, Abraham et al. [27] obtained the positive effect of molecular size on the permeability through the skin. This relationship is a reflection of the correlation between the size of the molecules and lipophilicity. In addition, molecular size has a negative correlation with diffusivity in biomembranes, confirming that the effects of partitioning are more dominant than the effects of diffusion [28].
The polar molecular surface area (PSA) is defined as the surface area occupied by the nitrogen and oxygen atoms and the polar hydrogens bonded to these heteroatoms. The penetration of a substance through biological barriers decreases when the hydrophilic part of its surface increases. PSA has been used extensively as a molecular descriptor in the studies of drug transport properties, such as intestinal absorption [29], BBB penetration [30], and membrane permeability [28,31,32]. Topological polar surface area (TPSA), a convenient measure of the polar surface area, was introduced by Ertl et al. [33] on the basis of an additive fragment method and is extremely popular in medicinal chemistry [34] for predicting ADME properties. In our research, we observed a significant negative impact of TPSA on log K p , log BB, and the Caco-2 parameters (Equations (3)-(5) and (9)-(14)). The increase of the polar surface area decreased the permeability through the skin, permeation of the blood-brain barrier, and intestinal permeability.
The factors that increase the binding of substances to serum albumin and lipids cause a simultaneous reduction of the unbound fraction in the brain, f u,brain . The equations derived in our studies (Equations (15)-(17), Figure S4 in the Supplementary Material) show that f u,brain decreased with increasing lipophilicity and molecular weight (MW) but increased with the hydrophilicity (TPSA) of the compound. Polarizability had a negligible effect on the f u,brain values.
In RPLC, the standard lipophilicity descriptors are the log k w parameters evaluated for water (buffer) as the mobile phase. In the case of micellar chromatography, the log k m values were used (Equation (2)), corresponding to the mobile phase without any micellized surfactant. In general, the determination of these parameters is time-consuming and requires multiple measurements using different mobile phases. Nevertheless, the quantities determined in this way are more reliable and similar to the partitioning parameter, log P. Frequently, in practice, the chromatographic parameters measured with mobile phases that contain an organic modifier can also be used to evaluate lipophilicity. Most often, experimental data are used that were measured with columns imitating biological systems, such as artificial membranes, immobilized cholesterol, and others. In our studies, we obtained very good linear correlations between the log k values obtained in MLC for mobile phases with different concentrations of Brij 35, i.e., 0.15, 0.125, 0.10, and 0.075 mol/L (Table 2). The correlation coefficients of these relationships were in the range of 0.902-0.942. Therefore, we decided to use the log k parameters measured in one micellar mobile phase to derive the quantitative structure-activity relationships. We chose the values measured in the mobile phase composed of 0.1 mol/L of surfactant Brij 35, i.e., log k 0.1 . For this mobile phase, the retention of individual substances was not too high (log k values in the range of 0.279-1.67). At the same time, the flow of the mobile phase through the column was not associated with high pressure. Appropriate equations (Equations (18)-(22)) and statistics are presented in Tables 3 and 4. In the statistical evaluation, these equations were similar and almost as good as those derived for the log k m , log k w,IAM , and log k w,ODS parameters.
The results indicate the effectiveness of micellar chromatography and its predictive ability in assessing the properties of bioactive substances. This technique also provided the advantage of being able to mimic biopartitioning systems. Our results show that there is a high probability that the pharmacokinetic properties of the tested compounds can be predicted accurately on the basis of chromatographic measurements performed in a single system with a micellar mobile phase.
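The linear correlations between log k values measured at different Brij 35 concentrations, mentioned above, can be quantified with the Pearson correlation coefficient. The following is a minimal sketch; the function name and the two log k series are hypothetical, synthetic values and not the data of Table 2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic log k values for the same solutes in two micellar mobile phases
log_k_phase_a = [0.30, 0.62, 0.95, 1.20, 1.66]
log_k_phase_b = [0.21, 0.55, 0.80, 1.12, 1.49]
r = pearson_r(log_k_phase_a, log_k_phase_b)
print(round(r, 3))
```

A coefficient close to 1 across all pairs of mobile phases supports using the retention measured in a single micellar system as the lipophilicity descriptor.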

Chromatographic Conditions
In the MLC technique with an ODS-2 column, buffered Brij 35 mixtures (0.15, 0.125, 0.10, and 0.075 mol/L) with the addition of 7% (v/v) isopropanol were used as mobile phases. The buffer was prepared from 0.01 mol/L solutions of Na2HPO4 and citric acid, and the pH was adjusted to 7.4 before mixing with the organic modifier. The flow rate was 1 mL/min. Buffered acetonitrile mixtures were used as mobile phases with the IAM column. Acetonitrile concentration, expressed as a volume fraction, was changed in the range of 0.2-0.5, with a constant step of 0.1. The flow rate was 1.3 mL/min. With the RP-18e column, acetonitrile concentration was changed in the range of 0.3-0.6, with a constant step of 0.1 and a flow rate of 0.1 mL/min. The tested solutes were 19 newly designed, structurally related compounds. Samples were dissolved in acetonitrile at ca. 0.005 mg/mL. The compounds were detected by UV absorbance at λ max = 254 nm. All measurements were carried out at a constant temperature (25 °C). The dead time values were measured from non-retained compound (e.g., sodium chloride) peaks. All reported k values are the average of at least 3 independent measurements.
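For orientation, the sketch below shows how a retention factor follows from the retention and dead times, k = (t_R - t_0)/t_0, and how a log k w value might be obtained by extrapolating log k to pure buffer. The linear model log k = log k_w - S·φ is the usual assumption in RPLC but is not stated explicitly in this section; the function names and all numerical values are hypothetical and synthetic. Only the volume fractions 0.2-0.5 correspond to the IAM conditions given above.

```python
def retention_factor(t_r, t_0):
    """Retention factor k from retention time t_r and dead time t_0."""
    return (t_r - t_0) / t_0


def extrapolate_log_kw(phi, log_k):
    """Fit log k = log k_w - S*phi by least squares; return (log_kw, S)."""
    n = len(phi)
    mx, my = sum(phi) / n, sum(log_k) / n
    sxx = sum((x - mx) ** 2 for x in phi)
    sxy = sum((x - mx) * (y - my) for x, y in zip(phi, log_k))
    slope = sxy / sxx
    intercept = my - slope * mx  # value of log k at phi = 0
    return intercept, -slope


# Retention factor from hypothetical times (min)
k_example = retention_factor(6.0, 1.5)

# Synthetic log k data at the acetonitrile volume fractions used with the IAM column
phi = [0.2, 0.3, 0.4, 0.5]
log_k = [2.5 - 3.0 * p for p in phi]  # generated with log k_w = 2.5, S = 3.0
log_kw, S = extrapolate_log_kw(phi, log_k)
print(k_example, round(log_kw, 3), round(S, 3))
```

In practice each k would first be averaged over the replicate injections before taking the logarithm and fitting.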

In Silico Calculations
Molecular weight (MW), topological polar surface area (TPSA), and polarizability (α) of the tested compounds (as independent variables), as well as pharmacokinetic parameters characterizing their distribution between the blood and brain (log BB), unbound fraction in the brain (f u,brain ), water-skin permeation (log K p ), binding to human plasma proteins (log K a,HSA ), and intestinal permeability (Caco-2) (as dependent variables), and the logarithms of the n-octanol/water partition coefficient (log P), were evaluated by ACD/Percepta software (Łodź, Poland). In this software, the log P values and pharmacokinetic descriptors are calculated from Abraham solvation parameters (i.e., the McGowan volume, polarizability/dipolarity, hydrogen bond basicity (accepting ability) and hydrogen bond acidity (donating ability), excess molar refraction, etc.), according to the concept of LSERs (linear solvation energy relationships) [35].
The risk of adverse effects of the investigated compounds was evaluated by the OSIRIS software, which is available online: http://www.organic-chemistry.org/prog/peo/ (accessed on 25 March 2021). This in silico tool uses the final datasets from the Registry of Toxic Effects of Chemical Substances (RTECS) database containing 7504 mutagenic, 2841 tumorigenic, 2372 irritant, and 3570 reproductive effective substances, as well as 3343 pharmaceutics as a control set. The qualitative prediction result encoded in green, yellow, and red indicates no risk, medium risk, and high risk of undesired effects, respectively.

Conclusions
Two-dimensional QSAR methodology was successful in modeling pharmacokinetic properties, i.e., the distribution between the blood and brain (log BB), the unbound fraction in the brain (f u,brain ), water-skin permeation (log K p ), binding to human plasma proteins (log K a,HSA ), and intestinal permeability (Caco-2) of fused azaisocytosine-containing congeners. Various liquid chromatography techniques were used to characterize all the title compounds regarded as promising drug candidates. Micellar parameters (log k m ) and log k w values measured on an artificial membrane (IAM) and on an end-capped ODS column were compared as lipophilicity descriptors and applied in the QSAR methodology. Apart from the chromatography-derived lipophilicity, the quantitative structure-activity relationships included both structural and electronic descriptors related to drug-like properties, i.e., topological polar surface area, molecular weight, and polarizability of the investigated molecules. All the derived QSAR equations were evaluated statistically and validated as being very good. The QSAR models that were developed revealed a high predictive ability and therefore provided reliable predictions in modeling the pharmacokinetic properties of the title molecules. All models used for prediction of the dependent solute property linked the retention parameters on MLC, IAM, and ODS with additional molecular descriptors related to drug-like properties. All the dependent pharmacokinetic properties obtained on the basis of QSAR equations were compared with those calculated in silico and were statistically validated as being very good. Applicability domains of the developed regression models were evaluated and visualized.
The investigations highlight the significance and possibilities of combined chromatographic techniques and QSAR methods in modeling important pharmacokinetic properties of our structurally related small molecules and reducing unethical animal testing. The micellar liquid chromatography technique made it possible to achieve a significant reduction in the time and cost of the experiments and also reduced the consumption of organic reagents. The results presented in this study will be particularly useful in further, more extensive in vivo research of the title compounds that are being considered as potential drugs.