Evaluation of Physicochemical Properties of Ipsapirone Derivatives Based on Chromatographic and Chemometric Approaches

Drug discovery is a challenging process, with many compounds failing to progress due to unmet pharmacokinetic criteria. Lipophilicity is an important physicochemical parameter that affects various pharmacokinetic processes, including absorption, metabolism, and excretion. This study evaluated the lipophilic properties of a library of ipsapirone derivatives that were previously synthesized to affect dopamine and serotonin receptors. Lipophilicity indices were determined using computational and chromatographic approaches. In addition, the affinity to human serum albumin (HSA) and phospholipids was assessed using biomimetic chromatography protocols. Quantitative Structure–Retention Relationship (QSRR) methodologies were used to determine the impact of theoretical descriptors on experimentally determined properties. A multiple linear regression (MLR) model was calculated to identify the most important features, and genetic algorithms (GAs) were used to assist in the selection of features. The resultant models showed commendable predictive accuracy, minimal error, and good concordance correlation coefficient values of 0.876, 0.149, and 0.930 for the validation group, respectively.


Introduction
Numerous drug candidates are dismissed in clinical trials due to insufficient pharmacokinetic properties [1,2]. Therefore, optimizing the physicochemical properties of potential drug molecules at the initial stage of drug development becomes crucial. The optimization is necessary to attain the desired drug metabolism and pharmacokinetic profile in vivo. The lipophilicity of a molecule is a well-known factor that affects its toxicity, absorption, distribution, metabolism, and elimination [3]. Consequently, lipophilicity assessment is one of the basic tests of drug candidates in early drug discovery. The chromatographic approach is the most frequently used among available methods since it offers several advantages compared to the traditional shake-flask procedure. The chromatographic approach requires minimal amounts of sample while being insensitive to impurities, and it is fully automated. In addition, the results of the chromatographic analyses are both repeatable and robust.
Therefore, the solid-liquid partitioning methods are highly convenient in the early stages of the drug discovery pipeline, prioritizing high throughput over accuracy [1,4].
The chromatographic approach also allows for measuring other bio-physicochemical properties of given molecules, such as affinity to phospholipids and plasma proteins (PPs) [5]. Plasma protein binding (PPB) mostly affects drug distribution, half-life, and clearance. A molecule bound to PP cannot enter organ tissue via passive diffusion through the physiological barriers. Only unbound drug molecules may interact with therapeutic targets, which means that molecules with high affinity to PPs (above 95%) will show limited brain penetration and low clearance and may cause drug safety issues due to serious drug-drug interactions. Low affinity to PPs, on the other hand, reduces the duration of drug action.
This work assessed the lipophilicity of previously synthesized libraries of ipsapirone derivatives designed to affect dopamine (D2R) or serotonin (5-HT1AR) receptors to reduce the symptoms of depression or schizophrenia [6,7]. In the case of drug candidates targeting the central nervous system (CNS), lipophilicity is an essential property since it determines passive diffusion through the blood-brain barrier (BBB).
Using a chromatographic method, we experimentally determined lipophilicity indices of the target ipsapirone derivatives. In parallel, we also calculated the lipophilicity of the studied molecules using several software packages. Additionally, the affinity to phospholipids was determined using immobilized artificial membrane (IAM) chromatography. Quantitative Structure-Retention Relationship (QSRR) models were proposed to better understand which molecular descriptors influence their lipophilicity. Furthermore, the affinity to human serum albumin (HSA), which is the dominant plasma protein (PP), was determined, and the relationship between the affinity to HSA and experimental and computational lipophilicity was examined.

Lipophilicity Assessment
Computational approaches for lipophilicity estimation offer several advantages over experimental methods, including quick calculation times and a reduced use of chemical reagents. Moreover, computational methods allow for the prediction of lipophilicity before synthesis, making it relevant to designing potential drug candidates [8]. However, it is important to note that several studies demonstrated that the calculated LogP can differ significantly from the actual value [9–11].
In Table 1, calculated lipophilicity indices of functionalized ipsapirone derivatives are summarized. Discrepancies in the computed LogP values are evident across various molecules, with notable differences observed in specific instances: molecule 9 exhibits a substantial variance of 1.87 between the iLogP and Silicos-IT LogP descriptors; molecule 2 shows a difference of 1.75 between iLogP and Silicos-IT LogP; and molecule 19 displays a disparity of 1.72 between MLogP and WLogP.
These variations can be explained by the diverse algorithms used in the computational methods. In Table 2, the basic descriptors of each algorithm are summarized. The lowest LogP values are derived from the MLogP, reaching the minimum value in 15 instances, while the remaining 11 compounds achieved their lowest values using the Silicos-IT LogP descriptor. The MLogP, or Moriguchi octanol-water partition coefficient, calculated using AlvaDesc software (version 2.0.10), is based on a qualitative structure-logP relationship utilizing topological indices along with molecular properties. The Silicos-IT LogP, calculated using SwissADME (http://www.swissadme.ch, accessed on 1 February 2024), is a hybrid method combining a fragmental approach with a topological one. The consensus value LogPcons (SwissADME) is the arithmetic mean of the values predicted using the five proposed methods calculated using SwissADME. Among the considered chemical compounds, molecule 9 was identified as the most hydrophilic substance according to each algorithm. The most lipophilic compound according to six descriptors was molecule 18; the LogP value was the highest for the following specified descriptors: ALogP, LogP99, LogPcons (AlvaDesc), LogP Chemicalize, LogP, iLogP, and XLogP3. According to three other descriptors, molecule 19 emerged as the most lipophilic, based on MLogP, WLogP, and LogPcons (SwissADME), and for one descriptor, Silicos-IT LogP, compound 25 proved to be the most lipophilic.
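The comparison of LogP predictions described above can be illustrated with a short sketch. It computes a consensus value as an arithmetic mean and reports the largest disagreement between two methods for one molecule. The numerical values and the function names are hypothetical and are not taken from Table 1.

```python
from statistics import mean

# Hypothetical LogP predictions for a single molecule (illustrative values only,
# not taken from Table 1).
predictions = {
    "iLogP": 3.10,
    "XLogP3": 2.80,
    "WLogP": 2.40,
    "MLogP": 1.90,
    "Silicos-IT LogP": 2.30,
}


def consensus_logp(values):
    """Arithmetic mean of the individual predictions."""
    return mean(values.values())


def largest_disagreement(values):
    """Return (difference, highest_method, lowest_method) for one molecule."""
    highest = max(values, key=values.get)
    lowest = min(values, key=values.get)
    return values[highest] - values[lowest], highest, lowest


difference, highest, lowest = largest_disagreement(predictions)
print(f"consensus LogP: {consensus_logp(predictions):.2f}")
print(f"largest gap: {difference:.2f} between {highest} and {lowest}")
```

Applied to every row of a table of predictions, the same two functions would flag the molecules for which the choice of algorithm matters most.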
Considering the significant differences in the calculations obtained, the next step of our investigation was the determination of lipophilicity using a chromatographic approach. Among the available protocols, the fast gradient approach developed by Valko was chosen because it enables the assessment of lipophilicity from a single chromatographic measurement [2,4,12–14]. In addition, this approach enables the determination of the acid/basic properties of molecules by adding experiments under different pH conditions. A summary of all the chromatographic data is presented in Table 3. The results show that the investigated ipsapirone derivatives have a rather high lipophilicity when considering the CHI scale from 0 to 100 (extrapolation is allowed). Considering that lipophilicity is a known factor influencing passive diffusion across the BBB, this is an important observation. The affinity to phospholipids can also be determined using one-gradient protocols and the IAM column. CHI IAM can be used as a cut-off point to indicate the potential for promiscuous binding and interference with phospholipids. Among the target structures, only molecule 18 had a slightly higher CHI IAM value of 50.19.
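In fast-gradient protocols of this kind, CHI values are generally obtained by calibrating gradient retention times against reference compounds with known CHI values. The sketch below shows such a linear calibration. The calibration points are hypothetical, and the function names are introduced here for illustration only; the calibration set actually used is not given in this section.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical calibration standards: gradient retention time (min) and
# reference CHI value. These numbers are placeholders, not measured data.
calibration_tr = [1.0, 2.0, 3.0, 4.0]
calibration_chi = [20.0, 45.0, 70.0, 95.0]

slope, intercept = fit_line(calibration_tr, calibration_chi)


def chi_from_retention(retention_time):
    """Convert a gradient retention time into a CHI value using the fitted line."""
    return slope * retention_time + intercept


print(chi_from_retention(3.5))
```

Because the calibration is linear, analytes eluting slightly outside the calibrated range can still be assigned a CHI value by extrapolation.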
The significantly lower value of CHI under acidic conditions indicates that all ipsapirone derivatives have basic character. Therefore, the CHI under pH 10.6 can be considered and converted to chromatographically determined LogP, called CHI LogP. A cluster analysis (CA) was performed to compare chromatographically determined and calculated lipophilicity. A CA can be used for the visualization of similarities and differences between studied objects, in this case, lipophilic parameters. The obtained results clearly indicate that significant differences between theoretical and experimental data occurred, and they formed two separate groups (Figure 1). The differences are also visible in the calculated correlation matrix (Figure 2). Moreover, even the more complex algorithms or calculations based on 3D optimized structures do not bring significant improvement, and all theoretical descriptors are similarly correlated with CHI LogP (r between 0.77 and 0.83).
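The correlation between each calculated descriptor and CHI LogP can be reproduced with a plain Pearson coefficient. The following sketch uses hypothetical values for five molecules; the numbers are placeholders and do not come from Tables 1 or 3.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: experimental CHI LogP and two calculated descriptors.
chi_logp = [2.1, 2.6, 3.0, 3.4, 3.9]
calculated = {
    "MLogP": [1.5, 2.2, 2.0, 2.9, 3.1],
    "XLogP3": [2.4, 2.5, 3.3, 3.2, 4.0],
}

for name, values in calculated.items():
    print(name, round(pearson_r(values, chi_logp), 2))
```

Computing the coefficient for every pair of indices gives the kind of correlation matrix shown in Figure 2.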

QSRR Modeling of Chromatography Determined Lipophilicity and Phospholipophilicity
The following step of our study focuses on QSRR modeling. The QSRR approach, introduced by Kaliszan [15], is currently one of the most widely used and powerful computational methods in the analytical field of chemistry. Numerous QSRR studies have been reported, mainly focused on retention prediction [16], supporting the identification of molecules, mostly in untargeted metabolomics [17], or on the comparison of chromatographic columns and systems [18].
Another advantage of the QSRR approach is the possibility of obtaining insights into the molecular mechanism of retention in the utilized chromatographic system, which can be directly translated into the relationship between molecule structure and a physicochemical endpoint measured chromatographically [19,20].
Our work focused on the application of QSRR to gain insight into the descriptors that determine the lipophilicity of ipsapirone derivatives. The goal was achieved using a hybrid approach of a genetic algorithm (GA) and multiple linear regression (MLR). In summary, a GA is a stochastic method that assists in solving variable selection problems. Therefore, integrating a GA and MLR may benefit the development of a highly accurate and predictive QSRR model. Table 4 presents a summary of the derived QSRR models together with statistical figures.
Table 5 lists the full name of each employed molecular descriptor, with its description and assigned block.
The obtained QSRR models indicated some similarities between chromatographically measured lipophilicity and phospholipophilicity. First, in both cases, the LLS_01 descriptor, referring to the local lipophilicity of molecules based on a score derived from the rules proposed by Congreve et al., plays an important role [21]. Furthermore, both models contain CATS descriptors. The CATS descriptors were created by Schneider and are members of the correlation-vector descriptor class, related to the atom-pair descriptor class [22]. The CATS descriptors code the frequencies of atom-type pairings, which may represent possible pharmacophoric locations and are adjusted using the topological characteristics of molecules. The CATS descriptors cover five potential pharmacophore points (PPPs): lipophilic (L), positively charged (P), negatively charged (N), hydrogen-bond acceptor (A), and hydrogen-bond donor (D). In the case of the IAM model, the percentage of N atoms completed the model, although it should be highlighted that all molecules have the same chemical characteristics and are typical organic bases.

The models obtained are well fitted, as indicated by the statistical figures of the training and testing sets, including the R², Q² LOO, and RMSE TR, and exhibit appropriate predictive parameters, such as the RMSE P. Furthermore, as confirmed through a Williams plot, the proposed model's applicability domain (AD) indicated good predictions (Figure 3).
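The horizontal axis of a Williams plot is the leverage of each compound, computed from the descriptor matrix of the training set. The sketch below calculates leverages and the commonly used warning value h* = 3(p + 1)/n, where p is the number of descriptors and n the number of training compounds. The descriptor values are hypothetical, the helper names are introduced here for illustration, and the exact settings applied in QSARINS are not stated in this section.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def leverages(train_x, query_x):
    """Leverage h = x^T (X^T X)^-1 x of each query row, relative to the training set."""
    design = [[1.0] + list(row) for row in train_x]  # intercept column added
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    inverse = invert(xtx)
    result = []
    for row in query_x:
        x = [1.0] + list(row)
        result.append(sum(x[i] * inverse[i][j] * x[j] for i in range(k) for j in range(k)))
    return result


def warning_leverage(n_train, n_descriptors):
    """Commonly used leverage cut-off h* = 3(p + 1)/n."""
    return 3.0 * (n_descriptors + 1) / n_train


# Hypothetical training descriptors (two descriptors, five compounds).
train = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
print(leverages(train, [[3.0, 3.0]]), warning_leverage(len(train), 2))
```

A compound whose leverage exceeds h* lies outside the structural domain of the training set, so its prediction should be treated as an extrapolation.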

Interaction between Plasma Protein
Utilizing columns modified by plasma proteins, such as human serum albumin (HSA), allows for estimating plasma protein binding (PPB). Generally, PPB affects drug pharmacokinetics, including distribution, half-life, and clearance [23–25]. While lipophilicity is a well-known factor in determining PPB, it should also be considered in early drug discovery. In general, more lipophilic compounds tend to show higher PP binding due to nonspecific interactions with proteins. However, hydrophilic compounds can also bind strongly to PP through spherical and electrostatic interactions by binding in protein pockets.
Chromatographically determined HSA affinity can be expressed as logK HSA, ranging from −0.8 to 1.9, or recalculated to a more informative % of HSA binding. The analyzed structures showed moderate affinity to HSA, except for molecule 23, which is relatively low-binding to HSA (logK HSA = −0.05 and %HSA = 47.66), and molecule 18, which showed a higher affinity than the reference diclofenac.
The next step of our investigation focused on finding molecular properties of ipsapirone derivatives that influence the affinity to HSA. Based on computational descriptors, the model yielded predictive statistics above 0.7 (Table S1 in the Supplementary Materials). This can be considered an acceptable model; however, an attempt was made to achieve a model with a more satisfactory result. To model this endpoint, chromatographically determined lipophilicity and phospholipophilicity were used. The CHI IAM descriptor was selected as a significant descriptor based on a GA. The statistics of this model significantly improved, providing more satisfactory results (Figure 4).
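The recalculation between logK HSA and % of HSA binding can be sketched as follows. The conversion formula is not given in this section; the sketch assumes the relationship commonly used in biomimetic chromatography, logK = log(%HSA / (101 − %HSA)), which reproduces the pair reported for molecule 23 to within rounding. The function names are illustrative.

```python
from math import log10


def percent_hsa(log_k):
    """Convert logK HSA into % HSA binding, assuming logK = log(% / (101 - %))."""
    k = 10 ** log_k
    return 101.0 * k / (1.0 + k)


def log_k_hsa(percent):
    """Inverse conversion: % HSA binding into logK HSA (same assumed relationship)."""
    return log10(percent / (101.0 - percent))


# Value reported in the text for molecule 23: logK HSA = -0.05.
print(round(percent_hsa(-0.05), 1))
```

The constant 101 keeps logK finite for compounds that are almost completely bound, which is why the % scale compresses strongly near the top of the range.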

The results suggest that the experimental phospholipophilicity data align more closely with the predicted interactions with PP. The correlation matrix, PCA, and HCA (Figure 1) were used to check this suggestion, and they indicate that experimental data were more effective for estimating certain biological properties, such as interactions with PP. All investigated methods indicated that experimentally measured lipophilicity or phospholipophilicity is better for predicting interactions with PP since these experiments were grouped together in the CA results.

Solvents
Buffer solutions (polar solvents for HPLC experiments) were prepared by dissolving HPLC-grade ammonium acetate (VWR International, Leuven, Belgium) in ultrapure water obtained from a Milli-Q water purification system (Merck Millipore, Darmstadt, Germany). To adjust the buffer solutions' pH, two concentrated solutions were used: ammonia (Avantor Performance Materials Poland S.A., Gliwice, Poland) and acetic acid

Analytes
The compounds studied were the ipsapirone derivatives described in the articles on their pharmacological activity and synthesis [6,7]. All investigated molecules are listed in the Supplementary Materials (Table S2).

Chromatographic Analysis
All experiments were performed using a high-performance liquid chromatography system (Shimadzu Prominence LC-2030C 3D) equipped with a DAD detector and controlled through a LabSolution system (version 5.90, Shimadzu, Tokyo, Japan).
For RP-HPLC experiments, a C18 Hypersil GOLD™ (50 mm × 4.6 mm; 5.0 µm with a guard column; Thermo Scientific, Waltham, MA, USA) column was applied. The column temperature was set to 40 °C. Three water solutions were used as mobile phase A: acetic acid at pH 2.6, mM ammonium acetate at pH 7.4, and 50 mM ammonium acetate at pH 10.5. Mobile phase B was acetonitrile (ACN). The linear gradient from 2 to 98% ACN was applied from 0 to 5.25 min and held at 98% ACN for 1.75 min. The mobile phase flow rate was 1.5 mL/min throughout the experiment.
The immobilized artificial membrane column (IAM.PC.DD2, 10 × 4.6 mm; 10.0 µm with a guard column; Regis Technologies, Morton Grove, IL, USA) was used for IAM-HPLC experiments. The column temperature was set to 30 °C. Similar to RP-HPLC, mobile phase A was 50 mM ammonium acetate at pH 7.4, and mobile phase B was acetonitrile. The linear gradient from 0 to 85% ACN was applied from 0 to 5.25 min and held at 98% ACN for 0.5 min at a constant flow rate of 1.5 mL/min.
For HSA-HPLC experiments, the column was Chiralpak® HSA (100 × 4 mm; 5 µm with safety guard column; Daicel Chiral Technologies, West Chester, PA, USA). The column temperature was set to 30 °C. The same phase A as in the IAM-HPLC case was used, and mobile phase B was isopropanol (i-PrOH). The linear gradient from 0 to 20% i-PrOH was applied from 0 to 15 min, held at 20% i-PrOH for 12 min, and then returned to pure ammonium acetate solution. The mobile phase flow rate was 0.9 mL/min throughout the experiment. Three minutes of column recalibration was applied between each run. The retention times were collected at wavelengths between 190 and 300 nm, and the injection volume was 10 µL.

Theoretical Descriptors
Theoretical descriptors were calculated using Chemicalize software (https://chemicalize.com, accessed on 1 February 2024) and alvaDesc software (version 2.0.10, Alvascience, Lecco, Italy) based on geometry optimization using Baker's EigenFollowing method using MOPAC software (version 3.0). Then, constant, almost constant, and highly correlated (r = 0.95) descriptors were removed. The final number of descriptors was 3170. In addition, several lipophilicity indices were calculated for each compound using the SwissADME web application (http://www.swissadme.ch, accessed on 1 February 2024) based on the SMILES notation.
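The descriptor pre-filtering step can be sketched as below. The sketch drops constant and almost constant descriptors and then removes one member of every highly correlated pair. The 0.95 cut-off follows the text; the definition of "almost constant" (one value shared by at least 80% of the compounds), the greedy order of removal, and all names and values are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_descriptors(table, r_cutoff=0.95, max_mode_fraction=0.8):
    """Keep descriptors that vary enough and are not highly correlated with a kept one.

    table maps a descriptor name to its list of values (one value per compound).
    max_mode_fraction is an assumed definition of "almost constant".
    """
    kept = {}
    for name, values in table.items():
        most_common_count = Counter(values).most_common(1)[0][1]
        if most_common_count / len(values) >= max_mode_fraction:
            continue  # constant or almost constant descriptor
        if any(abs(pearson_r(values, other)) >= r_cutoff for other in kept.values()):
            continue  # redundant with a descriptor that was already kept
        kept[name] = values
    return kept


# Hypothetical descriptor table for five compounds.
descriptors = {
    "d1": [1, 2, 3, 4, 5],
    "d2": [2, 4, 6, 8, 10.1],   # almost perfectly correlated with d1
    "d3": [0, 0, 0, 0, 0],      # constant
    "d4": [1, 1, 1, 1, 2],      # almost constant
    "d5": [5, 1, 4, 2, 3],
}
print(list(filter_descriptors(descriptors)))
```

Which member of a correlated pair survives depends on the processing order, so the descriptor table should be ordered deliberately if interpretability of the final model matters.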

CA Analysis
A CA was performed on databases that included chromatographic data and in silico-calculated lipophilicity indices. In order to eliminate the impact of various lipophilicity scales, data were standardized before analysis. The CA was conducted using Ward's agglomeration rule and the Euclidean distance measure using a self-written R script.
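The clustering procedure can be sketched in a few lines. The original analysis used a self-written R script that is not reproduced here; the Python sketch below is an independent illustration that standardizes each lipophilicity index and then merges clusters with Ward's rule through the Lance-Williams update on squared Euclidean distances. The index values are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score standardization, so that indices on different scales are comparable."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]


def ward_clusters(objects, n_clusters):
    """Agglomerate objects (name -> vector) with Ward's rule until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    names = list(objects)
    clusters = {i: [name] for i, name in enumerate(names)}
    # Squared Euclidean distances between all pairs of singleton clusters.
    dist = {
        (i, j): sum((a - b) ** 2 for a, b in zip(objects[names[i]], objects[names[j]]))
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > n_clusters:
        i, j = min(dist, key=dist.get)  # pair with the smallest Ward distance
        d_ij = dist.pop((i, j))
        n_i, n_j = len(clusters[i]), len(clusters[j])
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            n_k = len(clusters[k])
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Lance-Williams update for Ward's method.
            updated[(k, next_id)] = (
                (n_i + n_k) * d_ik + (n_j + n_k) * d_jk - n_k * d_ij
            ) / (n_i + n_j + n_k)
        dist.update(updated)
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        next_id += 1
    return [set(members) for members in clusters.values()]


# Hypothetical indices measured or calculated for four molecules.
raw = {
    "CHI LogP": [0.0, 1.0, 2.0, 3.0],
    "CHI IAM": [0.1, 1.1, 2.1, 2.9],
    "MLogP": [3.0, 1.0, 0.0, 2.0],
    "XLogP3": [2.9, 1.2, 0.1, 2.0],
}
standardized = {name: standardize(values) for name, values in raw.items()}
print(ward_clusters(standardized, 2))
```

Here the clustered objects are the lipophilicity indices themselves, each described by its standardized values across the molecules, which matches the way the CA is used in this study.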

QSRR Analysis
The process of choosing descriptors was facilitated through the utilization of a genetic algorithm (GA), while a multiple linear regression (MLR) method was applied for a regression analysis using QSARINS version 2.2.4 software, developed by Gramatica et al. [26,27]. The parameters governing the genetic algorithm were determined by specifying a population size of 10, a mutation rate of 20, and 500 generations per size. Prior to the computation of GA-MLR for each modeled endpoint, the target solutes were allocated into distinct groups, comprising a training group (n = 19) and a validation group (n = 7). An analysis of the logK HSA endpoint was conducted, incorporating two previously considered endpoints, namely CHI C18 and CHI IAM, as descriptors.
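Two of the statistics used to judge predictions for the validation group, the root mean square error and the concordance correlation coefficient (CCC), can be computed as in the sketch below. The observed and predicted values are hypothetical, and the function names are illustrative; the sketch is not the QSARINS implementation.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def ccc(observed, predicted):
    """Concordance correlation coefficient (Lin's coefficient)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    return 2.0 * sxy / (sxx + syy + n * (mean_o - mean_p) ** 2)


# Hypothetical validation group of seven compounds (placeholder values).
observed = [0.42, 0.65, 0.80, 0.95, 1.10, 1.30, 1.55]
predicted = [0.50, 0.60, 0.85, 0.90, 1.20, 1.25, 1.45]
print(f"RMSE = {rmse(observed, predicted):.3f}, CCC = {ccc(observed, predicted):.3f}")
```

Unlike the Pearson coefficient, the CCC penalizes systematic shifts between observed and predicted values, so a model that is well correlated but biased will score lower.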

Conclusions
In summary, our study employed a combined experimental and computational approach to investigate the lipophilicity of ipsapirone derivatives. We observed significant disparities between LogP values calculated using different computational methods. Additionally, differences between calculated and chromatographically established results were observed through a CA. The relatively high lipophilicity indices suggest that the investigated structures should be able to cross the BBB. Our QSRR modeling efforts identified key molecular descriptors influencing lipophilicity, phospholipophilicity, and binding to HSA. The models related to lipophilicity and affinity to phospholipids include CATS 3D and drug-like indices among the selected descriptors. Importantly, integrating experimental data into predictive models for plasma protein binding improved model performance, emphasizing the importance of chromatographic assessments.

Figure 1. Results of CA analysis.

Molecules 2024 , 6 Figure 2 .
Figure 2. Correlation matrix between calculated and experimentally determined lipophilicity indices and affinity to HSA.

Figure 3. Comparison between experimental retention indices and those predicted by models and Williams plot for (A) C18-bonded stationary phase using pH 7.4 buffer and (B) IAM stationary phase. Green dots refer to the training set, whereas purple ones refer to the validation set.

Figure 4. Comparison between the experimental retention indices and those predicted by models and Williams plot for HSA stationary phase for the best-obtained model (model 3). Green dots refer to the training set, whereas purple ones refer to the validation set.

Table 1. The calculated LogP values of the ipsapirone derivatives concerning the computational model.

Table 2. List of software used with information regarding algorithms.

Table 3. The summarized values of the chromatographic and biochromatographic indices of the target functionalized ipsapirone derivatives.

Table 5. Full name and block for molecular descriptors applied in QSRR analysis.