Studies on Log Po/w of Quinoxaline di-N-Oxides: A Comparison of RP-HPLC Experimental and Predictive Approaches

As reported in our previous papers, a series of quinoxaline-2-carboxamide 1,4-di-N-oxide derivatives were synthesized and studied as anti-tuberculosis agents. Here, the capability of the shake-flask method was studied and the retention times (expressed as log K) of 20 compounds were determined by RP-HPLC analysis. We found that the prediction of log P by RP-HPLC analysis can achieve high accuracy and can replace the shake-flask method, avoiding the experimental problems presented by quinoxaline di-N-oxides. The studied compounds were subjected to the ALOGPS module with the aim of comparing experimental log Po/w values and predicted data. Moreover, a preliminary in silico screening of the QSAR relationship was made, confirming the influence of reduction peak potential, lipophilicity, H-bond donor capacity and molecular dimension descriptors on anti-tuberculosis activity.


Introduction
Quinoxaline derivatives are a class of compounds that show very interesting biological properties and are therefore receiving increasing attention from many medicinal chemistry researchers. The quinoxaline ring has been described as a bioisostere of quinoline, naphthalene and some other heterocycles which are the base of many antimalarial, antibacterial or antitumor agents (such as quinine, mefloquine, isoniazid, pyrazinamide or tirapazamine) [1].
Taking into account the fact that quinoxaline di-N-oxides (QdOs) are receiving growing attention in the field of medicinal chemistry, it would be interesting to study the physicochemical properties of this family of compounds. The study of absorption, distribution, metabolism and excretion (ADME) should be considered in the first stages of compound development, as this information could be of great help when identifying new candidates or optimizing structures [16][17][18][19]. The main properties for studying ADME in biological systems are solubility, lipophilicity, stability, and acid-base character. Lipid solubility is one of the most important determinants of the pharmacokinetic characteristics of a drug, and many properties, such as absorption, penetration or elimination, are related to lipophilicity.
The logarithm of the n-octanol/water partition coefficient (log P o/w ) is the most frequently used parameter for measuring lipophilicity, as it has been shown that this partition system is a good model for many biological processes [16,17,20-22]. In fact, log P o/w is also one of the standard properties identified by Lipinski in the "rule of 5" for drug-like molecules [23,24].
Measurement of this parameter is always recommended and many methods have been developed for this purpose. The classical shake-flask method is time-consuming. Its use is limited to the log P o/w range between -2 and 4, and it cannot be used with surface-active materials [25]. For these reasons, many chromatographic methods have successfully been used to assess the lipophilicity of organic compounds. HPLC provides an easy, reliable and accurate way to determine the partition properties of compounds based on their chromatographic retention times [16,17,26,27]. On the other hand, since the 1970s, several methods have been proposed for log P computation. These methods can be divided into two groups: property-based methods and additive methods. The former group computes log P as a function of molecular properties such as molecular surfaces, volumes, partial charges or HOMO/LUMO energies, including topological indices as descriptors [28,29]. Additive methods, first introduced by Hansch and co-workers [30][31][32], use basic structural building blocks as descriptors. They calculate the log P of a molecule by summing the contributions from all the blocks of the structure and applying some correction factors [26,33-35].
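The additive principle described above can be illustrated with a minimal Python sketch. The fragment names and all numerical contributions below are hypothetical placeholders chosen only for illustration; they are not taken from any published parameter set or from the programs evaluated in this work.

```python
# Illustrative additive log P estimate: the sum of fragment contributions
# plus correction factors. All numerical values here are hypothetical.

HYPOTHETICAL_FRAGMENT_CONTRIBUTIONS = {
    "aromatic_C": 0.30,
    "aliphatic_CH3": 0.50,
    "amide": -1.00,
    "N_oxide": -0.80,
}


def additive_log_p(fragment_counts, corrections=()):
    """Return sum(count_i * contribution_i) + sum(correction factors)."""
    total = sum(
        HYPOTHETICAL_FRAGMENT_CONTRIBUTIONS[name] * count
        for name, count in fragment_counts.items()
    )
    return total + sum(corrections)


# Hypothetical fragment counts for an arbitrary molecule.
example_counts = {"aromatic_C": 8, "aliphatic_CH3": 1, "amide": 1, "N_oxide": 2}
estimate = additive_log_p(example_counts, corrections=[0.10])
print(f"estimated log P = {estimate:.2f}")
```

In such a scheme the quality of the estimate depends entirely on how well each fragment, here in particular the N-oxide function, is parametrized.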
In this paper the viability of the RP-HPLC and the shake-flask methods to measure the partition coefficients of QdO is studied. Moreover, the capability of different predictive methods to properly parametrize the N-oxide function is evaluated.

Experimental log P o/w : Shake-Flask vs. RP-HPLC Method
The lipophilicity of a drug is related to its ability to cross cell membranes by means of passive diffusion. This property is usually expressed by the logarithm of the n-octanol/water partition coefficient, log P o/w . The log P o/w reflects the relative solubility of the drug in n-octanol (a model of the lipid bilayer of a cell membrane) and water (the fluid inside and outside cells). Traditionally, log P o/w values are measured using the shake-flask method with the n-octanol/water partition system.
The photochemical instability of QdO is well known and has been reported in many studies [36][37][38]. It has been observed that the absorption spectrum of neutral QdO solutions changes quickly on exposure to sunlight. For these reasons, solutions of QdO must be kept protected from light and used as soon as they are prepared. With the aim of studying the suitability of the shake-flask method, four compounds (10, 14, 22 and 26) were selected and many attempts to measure the log P o/w were performed using the classical shake-flask method. These attempts verified that the classical shake-flask method is not suitable for the measurement of log P o/w values of QdO derivatives: these compounds have very low solubility in water and, above all, some of them degrade in solution (see supporting information for details) and create emulsions during the partition procedure.
For these reasons, an RP-HPLC method was used for the determination of log P o/w values of QdO derivatives. The RP-HPLC method used (see Experimental section) is applicable to the QdO derivatives because it is performed at pH = 7.4, and quinoxaline di-N-oxide derivatives are more stable in neutral solutions than at extreme pH values. Moreover, the period of time in which the compounds are in contact with the mobile phase is not long enough for the quinoxalines to degrade, as was observed when studying the corresponding chromatograms (see supporting information for details).

Correlation between log P o/w and log k' 0
In this study nine compounds (1-9) were selected as reference compounds from the Recommended Reference Compounds list published by the OECD [39]. The chosen compounds allow a model to be built that covers a fairly wide range of log P o/w values (i.e., from ca. 0 to 4.5). The retention times of these reference compounds were measured and expressed as log k'. The log k' values were then extrapolated to 0% methanol in order to determine the capacity factors represented as log k' 0 . To predict the log P o/w values from the log k' 0 values, least-squares regression was employed to generate Equation (1). The cross-validation statistical parameters were determined with the leave-one-out (LOO) procedure, and the detailed results of the LOO test are presented in Table 1. Taking into account these values, it can be said that Equation (1) can be used to predict the log P o/w of QdO derivatives using the log k' 0 values.

log P o/w of Quinoxalines di-N-Oxide
Retention times of QdO 10-29 were measured and the capacity factors (log k') were calculated in varying proportions of methanol from 70% to 40%. The capacity factors were extrapolated to 0% methanol as in the case of the reference compounds (Table 2). The related log P o/w values were determined using Equation (1) and are also reported in Table 2. Examination of the data indicates the influence of the quinoxaline structure on the log P o/w values. As expected, the log P o/w values of all the QdO derivatives are positive, in a range between 1.5 and 4.5. The 20 compounds can be divided into five different series based on structure. Within each series of analogues, the insertion of a methyl group resulted in a slight increase of the log P o/w value, and replacing the hydrogen atom with one or two chloro groups produced an increase of 0.5 or 1.0 unit, respectively. Lengthening the aliphatic chain between the amide group and the aromatic system resulted in an increase of the log P o/w values, as can be observed by comparing derivatives 10, 11, 12, 13 with 14, 15, 16, 17. The p-bromo substituent (compounds 18-21) increases the lipophilicity of the molecules. Finally, compounds 26, 27, 28 and 29, which contain a diphenyl substituent on the amide chain, presented the highest log P o/w values of all the QdO derivatives.
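The extrapolation of log k' to 0% methanol used to obtain log k' 0 can be sketched in Python as follows. The sketch assumes that log k' varies linearly with the methanol percentage over the 70-40% range and takes the intercept of an ordinary least-squares line as log k' 0 ; the log k' values are invented for illustration and do not correspond to any compound in Table 2.

```python
# Extrapolate log k' to 0% methanol with an ordinary least-squares line.
# Assumption: log k' is linear in the methanol percentage.
# The retention data below are invented for illustration only.


def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


methanol_percent = [70, 60, 50, 40]
log_k = [-0.30, 0.10, 0.50, 0.90]  # hypothetical log k' values

# The intercept at 0% methanol is the extrapolated capacity factor log k'0.
slope, log_k0 = linear_fit(methanol_percent, log_k)
print(f"slope = {slope:.3f} per % methanol, log k'0 = {log_k0:.2f}")
```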
Therefore, the results suggest that HPLC analysis is a suitable method for determining the log P o/w of quinoxaline derivatives, in place of classical methods which are too slow, labor intensive and expensive. In particular, the RP-HPLC method avoids the problems associated with the classical shake-flask method, which cannot be used with this kind of compound.

Calculated log P o/w
All of the QdO were subjected to the ALOGPS online module. Seven values were obtained for each compound using the different computational methods included in this module. The predicted log P o/w data are reported in Table 3. The root mean squared error (RMSE) was calculated for each computational method in order to judge which method best matches the experimental RP-HPLC values.
The RMSE calculated from experimental and estimated log P o/w values ranged between 0.36 (XLOGP3) and 3.52 (MLOGP). From these data, XLOGP3 can be considered the best approach to estimate log P o/w for the QdO. Nevertheless, in an attempt to improve the predictive capacity of ALOGPS, the LIBRARY mode was used: the experimental log P o/w data of compounds 10-13 were used to generate the library, which was then taken into consideration for recalculating the log P o/w of the rest of the QdO (Table 4). Examination of the data reveals that the ALOGPS LIBRARY mode presents an RMSE of 0.19, making it the best approach for predicting the partition coefficients of QdOs.
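The ranking of predictive methods by RMSE against the experimental values can be sketched as follows. The method names and all numbers in this Python example are hypothetical placeholders, not the values of Tables 3 and 4.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between predicted and reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(reference))


# Hypothetical experimental log P values and two hypothetical predictors.
experimental = [2.0, 2.5, 3.0, 3.5]
predictions = {
    "method_A": [2.1, 2.4, 3.2, 3.4],
    "method_B": [1.0, 1.4, 2.1, 2.3],
}

# Lower RMSE means closer agreement with the experimental values.
scores = {name: rmse(values, experimental) for name, values in predictions.items()}
best = min(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f}")
```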

In Silico Screening of the QSAR Relationship
The anti-tuberculosis activity of the studied compounds has been previously reported [40,41]. In vitro evaluation of the anti-tuberculosis activity has been carried out within the Tuberculosis Antimicrobial Acquisition & Coordinating Facility (TAACF) screening program for the discovery of novel drugs for the treatment of tuberculosis. The compounds were tested against Mycobacterium tuberculosis H37Rv (ATCC 27294) in BACTEC 12B medium using the Microplate Alamar Blue Assay (MABA). Compounds showing an IC 90 value of ≤10 µg/mL were considered "Active" for antitubercular activity and considered for the VERO cell cytotoxicity assay. Cytotoxicity is determined as the CC 50 using a curve fitting program. Ultimately, the CC 50 is divided by the IC 90 to calculate a SI (Selectivity Index) value. SI values of ≥10 are considered "Active".
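The two-stage screening logic described above (an IC 90 cut-off followed by the selectivity index) can be written as a short Python sketch. The cut-off values are those quoted in the text; the function names and the example concentrations are hypothetical.

```python
IC90_CUTOFF_UG_ML = 10.0  # IC90 <= 10 µg/mL counts as "Active" (from the text)
SI_CUTOFF = 10.0          # SI >= 10 counts as "Active" (from the text)


def selectivity_index(cc50, ic90):
    """SI = CC50 / IC90, with both values in the same concentration units."""
    return cc50 / ic90


def screen(ic90, cc50=None):
    """Apply the primary IC90 cut-off, then the SI cut-off if CC50 is available."""
    primary_active = ic90 <= IC90_CUTOFF_UG_ML
    result = {"primary_active": primary_active, "si": None, "si_active": None}
    if primary_active and cc50 is not None:
        si = selectivity_index(cc50, ic90)
        result["si"] = si
        result["si_active"] = si >= SI_CUTOFF
    return result


# Hypothetical compound: IC90 = 2.0 µg/mL, CC50 = 50.0 µg/mL.
example = screen(ic90=2.0, cc50=50.0)
print(example)
```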
Since log P o/w is an important ADME parameter, a correlation with the anti-tuberculosis activity of compounds was looked for. However, it was verified that there is not a strong relationship (R 2 < 0.2) between log P o/w and activity, expressed as log(1/IC 50 ) or log(1/IC 90 ). In spite of this, it seems that a low value of log P o/w could be related with better values of anti-tuberculosis activity.
This could be explained as a consequence of the structure of the M. tuberculosis envelope. Porins present in the membrane control the diffusion of small hydrophilic molecules and, therefore, M. tuberculosis is more permeable to hydrophilic drugs such as INH, PZA, EMB or gatifloxacin, which present log P values of −0.71, −0.71, −0.12 and −0.23, respectively (ALOGPS, [42]). In this sense, lipophilic molecules should be able to cross the lipid bilayer easily; however, the bilayer's uncommon thickness and the presence of mycolic acids seem to decrease the permeability to lipophilic drugs. Nevertheless, it has been observed that the more lipophilic the agents are, the more active they usually are against M. tuberculosis; for instance, PAS, RIF and rifapentine present log P values of 0.62, 3.85 and 4.83, respectively (ALOGPS). This suggests that there must be a specific pathway for lipophilic drug transport [43].
Recent studies have focused on the influence of the physicochemical properties of antibacterial drugs and have concluded that it is not possible to establish a strong relationship between these properties and anti-tuberculosis activity. In this sense, it seems that much work is still needed in order to understand M. tuberculosis and its metabolism. At present, recent reviews affirm that antibacterial drugs occupy a special physicochemical space completely different from the space covered by drugs in many other therapeutic areas [44].
On the other hand, in the previous paper a good relationship was found between redox peak potential E pc,1 and activity [45]. In that case, however, the correlation was found within an almost homogeneous series of molecules. Here a more heterogeneous set of compounds has been studied. For molecules 11-24, 26-29 there is not a strong relationship (R 2 < 0.2) between peak potential and activity. However, it is evident that the compounds roughly align (according to the R substituent) or group themselves (according to the R 6 and R 7 substituents) in homologous series. This behaviour justifies the previously found relationship for a small and almost uniform set of molecules.
The above results indicate that log P o/w and E pc,1 , although important parameters, cannot alone explain the observed activity. When the effect of both factors was considered simultaneously, a two-variable model was built to predict activity, resulting in a better fitting ability (R 2 ≅ 0.53). Some other variable is probably needed to improve the model.
In order to enlarge the set of possible variables, the structures of compounds 10-29 were optimised with the MOE (Molecular Operating Environment) software using an MMFF force field [46]. From the optimised structures 333 molecular descriptors were calculated. This set of parameters was reduced to 269 by excluding variables with constant values. A multiple linear regression (MLR) procedure with a stepwise forward selection of the variables was applied to the whole set of variables (269 from MOE plus the experimental log P o/w and redox peak potential). This resulted in a two-variable model (R 2 ≅ 0.8) both for log(1/IC 50 ) and log(1/IC 90 ).
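The idea of a forward selection of variables for MLR can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: at each step it adds the descriptor giving the largest reduction of the residual sum of squares and stops after a fixed number of variables, whereas the entry criterion actually used with the MOE descriptors is not specified in the text. The descriptor names and values are synthetic.

```python
# Forward selection for multiple linear regression (pure Python sketch).
# Assumption: variables enter by largest reduction of the residual sum of squares.
# Descriptor names and values are synthetic, for illustration only.


def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_component(vector, direction):
    """Subtract from `vector` its projection onto `direction`."""
    scale = dot(vector, direction) / dot(direction, direction)
    return [v - scale * d for v, d in zip(vector, direction)]


def forward_selection(descriptors, activity, n_select):
    """Return (selected descriptor names, R^2 after each selection step)."""
    residual = center(activity)  # centering accounts for the intercept
    total_ss = dot(residual, residual)
    remaining = {name: center(col) for name, col in descriptors.items()}
    selected, r2_path = [], []
    for _ in range(n_select):
        # Reduction of the residual sum of squares for each candidate descriptor.
        gains = {
            name: dot(residual, col) ** 2 / dot(col, col)
            for name, col in remaining.items()
            if dot(col, col) > 1e-12  # skip constant or collinear columns
        }
        if not gains:
            break
        best = max(gains, key=gains.get)
        direction = remaining.pop(best)
        residual = remove_component(residual, direction)
        # Orthogonalise the remaining descriptors against the chosen one.
        remaining = {n: remove_component(c, direction) for n, c in remaining.items()}
        selected.append(best)
        r2_path.append(1.0 - dot(residual, residual) / total_ss)
    return selected, r2_path


descriptors = {
    "desc_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "desc_B": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "desc_C": [5.0, 3.0, 6.0, 1.0, 2.0, 4.0],
}
# Synthetic activity built as 2 * desc_A - desc_C, so two variables fit it exactly.
activity = [2 * a - c for a, c in zip(descriptors["desc_A"], descriptors["desc_C"])]

selected, r2_path = forward_selection(descriptors, activity, n_select=2)
print(selected, [round(r2, 3) for r2 in r2_path])
```

With only 20 compounds and several hundred candidate descriptors, any such selection is prone to chance correlations, which is why the resulting models are presented here as preliminary.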
One selected descriptor (vsurf_HB6) belongs to the "Surface Area, Volume and Shape Descriptors" family and is related to the H-bond donor capacity. The other descriptor (SlogP_VSA4) belongs to the "Subdivided Surface Areas" group and is related to an approximate accessible van der Waals surface area calculation for each atom, along with the contribution to log P o/w (as calculated in MOE) for each atom (see Supporting Information).
The addition of both experimental log P o/w and E pc,1 to the previously selected two descriptors allowed us to build a four-variable model with R 2 = 0.85 for log(1/IC 50 ) and R 2 = 0.84 for log(1/IC 90 ).
In conclusion, the reported statistical analysis confirms the importance of the reduction peak potential E pc,1 in the definition of the cytotoxic activity. This reduction process has been demonstrated to be consistent with reduction of the N-oxide functionality to form a reactive radical anion, which could lead further to superoxide ion or other toxic oxy radical species responsible for the biological activity [45]. However, in order to obtain a more general QSAR relationship, three other descriptors should be considered. Although the relatively small number of compounds considered, with respect to the large set of descriptors, limited the fitting ability of the model (R 2 ≥ 0.84), the lipophilicity, the H-bond donor capacity and a descriptor related to the molecular dimensions were found to be involved in the modulation of the final biological activity. In particular, on the basis of the regression coefficients, the redox properties (E pc,1 ) correlate positively with both log(1/IC 50 ) and log(1/IC 90 ); this means that an easier reduction increases the activity. In contrast, the lipophilicity and shape descriptors affect the activity negatively.

RP-HPLC Method
The RP-HPLC methods of Minick [52] and Lombardo [53] were considered for estimating the n-octanol/water partition coefficients (log P o/w ) of QdO. The inorganic salts and the reference compounds were, at least, of analytical grade (Sigma Aldrich and Fluka) and used without further purification. HPLC-grade methanol, n-octanol (Sigma-Aldrich) and bi-distilled water were used to prepare the mobile phase. The retention times (t R ) were measured using a Waters 2695 Separation Module system and a Waters 2487 Dual λ Absorbance Detector with Empower Pro Software.
The Supelcosil LC-ABZ column (5 μm, 15 cm × 4.6 mm) [18,52-54] was selected to substitute the organic phase of the shake-flask method because it has been reported that this column affords a reasonable correlation model for a great variety of compounds [18,22,49,55]. The mobile phase consisted of 20 mM 3-morpholinopropanesulfonic acid (MOPS) buffer (pH 7.4) and methanol in varying proportions, from 70 to 40%. n-Octanol (0.25%) was added to the methanol, and n-octanol-saturated water was used to prepare the buffer [41]. The other chromatographic conditions were: flow rate 1.0 mL/min; isocratic elution; UV-visible detector set at the wavelengths of maximum absorbance (260 and 210 nm). The test solutions were 0.3 mM in the desired compound. All the experiments were performed at 25 ± 2 °C at least twice.
The t R of each QdO derivative 10-29 was measured at different proportions of methanol (from 70 to 40%), and injections of pure methanol were used to determine the column dead-time (t 0 ). The capacity factors were calculated according to Equation (2):
$$\log k' = \log\left(\frac{t_R - t_0}{t_0}\right) \qquad (2)$$
Starting from these results, the extrapolation to 0% methanol for each compound was calculated and the capacity factors were used to predict the corresponding log P o/w . The capacity factors of compounds 1-9, with known log P o/w [16,20,39] and accepted as reference compounds by the OECD [39], were used to create a calibration curve [Equation (3)]:
$$\log P_{o/w} = a \cdot \log k'_0 + b \qquad (3)$$
where a and b are the slope and intercept obtained by linear regression. The common coefficient of determination R 2 was used to evaluate the fitting ability of the model. Another measurement for defining the accuracy of the proposed model is the RMSE (Root Mean Squared Error), which summarizes the overall error of the model [Equation (4)]:
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \qquad (4)$$
where ŷ i is the log P o/w value calculated by Equation (3), y i is the reference value and n is the number of reference compounds used to create the curve.
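The calculation chain described above, from retention times to a calibrated log P o/w , can be sketched in Python. The sketch assumes the usual definition of the capacity factor, k' = (t R − t 0 )/t 0 , and a straight-line calibration; the retention times, log k' 0 values and reference log P o/w values are invented and are not the data of compounds 1-9.

```python
import math


def log_capacity_factor(t_r, t_0):
    """log k' = log10((t_R - t_0) / t_0), with both times in the same units."""
    return math.log10((t_r - t_0) / t_0)


def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical single measurement: t_R = 6.0 min, t_0 = 1.5 min.
example_log_k = log_capacity_factor(t_r=6.0, t_0=1.5)

# Hypothetical reference set: extrapolated log k'0 and known log P values.
ref_log_k0 = [0.5, 1.5, 2.5, 3.5]
ref_log_p = [0.9, 2.1, 2.9, 4.1]
slope, intercept = linear_fit(ref_log_k0, ref_log_p)


def predict_log_p(log_k0):
    """Apply the calibration line to an extrapolated capacity factor."""
    return slope * log_k0 + intercept


print(f"log k' = {example_log_k:.3f}")
print(f"calibration: log P = {slope:.2f} * log k'0 + {intercept:.2f}")
print(f"predicted log P for log k'0 = 2.0: {predict_log_p(2.0):.2f}")
```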

Cross-Validation of the RP-HPLC Method
In order to judge if the experimentally measured log k' 0 can be used to predict the log P o/w value, a calibration curve was created with the capacity factors of the reference compounds and a linear regression equation between the log P o/w and log k' 0 was determined.
The robustness of the model and its predictivity were evaluated by the leave-one-out (LOO) cross-validation procedure [28,56]. According to this procedure, the log P o/w value of each compound in the reference data set is predicted by the equation derived from all the other compounds, excluding the predicted one. At the end of this procedure two values are available for each reference compound: the reference shake-flask log P o/w value (y i ) and the predicted one (ŷ i ). With these data, two statistical parameters (R 2 LOO , RMSE LOO ) were calculated to indicate the predictivity of the model. The cross-validated coefficient of determination (R 2 LOO ) is defined by Equation (5):
$$R^2_{LOO} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (5)$$
where ȳ is the mean of the reference shake-flask log P o/w values and n is the number of reference compounds. A high value of R 2 LOO indicates a good predictive ability of the model. The root mean squared error in prediction (RMSE LOO ) is calculated with Equation (6):
$$\mathrm{RMSE}_{LOO} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \qquad (6)$$
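The leave-one-out procedure can be sketched in Python as follows, assuming a straight-line calibration between log k' 0 and log P o/w . The function names and the reference data are hypothetical and serve only to show how R 2 LOO and RMSE LOO are obtained from the held-out predictions.

```python
import math


def linear_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out(x, y):
    """Return (R2_LOO, RMSE_LOO, predictions) for a linear calibration."""
    predictions = []
    for i in range(len(x)):
        # Refit the line without compound i, then predict compound i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = linear_fit(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    mean_y = sum(y) / len(y)
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / total_ss, math.sqrt(press / len(y)), predictions


# Hypothetical reference compounds: log k'0 and shake-flask log P values.
ref_log_k0 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ref_log_p = [0.6, 1.2, 1.4, 2.1, 2.4, 3.1]

r2_loo, rmse_loo, loo_predictions = leave_one_out(ref_log_k0, ref_log_p)
print(f"R2_LOO = {r2_loo:.3f}, RMSE_LOO = {rmse_loo:.3f}")
```

Because each prediction comes from a model that never saw the compound in question, R 2 LOO is always lower than or equal to the ordinary R 2 of the full fit.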

log P o/w Predictive Approaches
All of the synthesized QdO were subjected to the ALOGPS module with the aim of comparing experimental log P o/w values and predicted data. The ALOGPS method is part of the ALOGPS 2.1 program, in which the lipophilicity calculations are based on the associative neural network approach and an efficient partition algorithm. ALOGPS was developed with 12,908 molecules from the PHYSPROP database using 75 E-state indices. Sixty-four neural networks were trained using 50% of the molecules, selected by chance from the whole set. The log P o/w prediction accuracy is RMSE = 0.35 and standard mean error s = 0.26 [33].
This program also provides the possibility of including new data in the memory of the neural nets without retraining the neural networks themselves, in the so-called LIBRARY mode. The LIBRARY mode dramatically improves the log P o/w prediction of the ALOGPS program when in-house data sets are used [21,33,35].
The LogKow (Kow-WIN) program estimates the log P o/w of organic compounds and drugs using an atom/fragment contribution method developed at Syracuse Research Corporation [57].
The miLogP is calculated by the methodology developed by Molinspiration as a sum of fragment-based contributions and correction factors. This method for log P o/w prediction is based on group contributions which have been obtained by fitting calculated log P o/w with experimental log P o/w for a training set of more than twelve thousand, mostly drug-like, molecules. The Molinspiration methodology [58] for log P o/w calculation is very robust and capable of processing practically all organic and most organometallic molecules.
XLOGP2 gives log P o/w values by summing the contributions of component atoms and correction factors. Altogether 90 atom types are used to classify carbon, nitrogen, oxygen, sulfur, phosphorus and halogen atoms, and 10 correction factors are used for some special substructures. The contributions of each atom type and correction factor are derived by multivariate regression analysis of 1853 organic compounds with known experimental log P o/w values [34].
The additive model implemented in XLOGP3 uses a total of 87 atom/group types and two correction factors as descriptors. It is calibrated on a training set of 8,199 organic compounds with reliable log P o/w data through a multivariate linear regression analysis [26].
ALOGP, also known as "Ghose-Crippen octanol-water partition coefficient", is a log P o/w calculated with the Ghose-Crippen contribution method based on hydrophobic atomic constants measuring the lipophilic contributions of atoms in the molecule, each described by its neighbouring atoms [59]. MLOGP, also known as "Moriguchi octanol-water partition coefficient", expresses log P o/w in terms of 13 structural parameters.

Conclusions
In this study the RP-HPLC retention times of 20 QdO derivatives were measured and it was found that highly accurate log P o/w values can be predicted from the experimental log k' 0 by using the linear regression equation relating log P o/w and log k' 0 (Equation 1). Consequently, the results suggest that the determination of log P o/w for QdO can be successfully carried out by measuring the capacity factors based on the simple RP-HPLC method. This method allows us to avoid the experimental problems presented by the classical shake-flask method when trying to measure the partition coefficients of QdO derivatives that present low stability in aqueous solution and create emulsions during partitioning.
After a comparison among different methods for the calculation of log P o/w , in cases where no experimental data is available, XLOGP3 is proposed as the best program to calculate the log P o/w values for the QdO presented in this work (RMSE = 0.36). On the other hand, the ALOGPS LIBRARY method, implemented with the experimental data, predicts log P o/w values that match the experimental ones with the lowest RMSE (0.19).
Finally, a preliminary statistical analysis confirms the importance of the reduction peak potential E pc,1 in the definition of the cytotoxic activity of QdO. Moreover, the lipophilicity, the H-bond donor capacity and a descriptor related to the molecular dimension were found to also be involved in the modulation of the final biological activity.