Design of Anti-HIV Ligands by Means of Minimal Topological Difference (MTD) Method

Molecular modeling and MTD methods are useful tools to assess both qualitative(SAR) and quantitative (QSAR) chemical structure-biological activity relationships. The 1-[(2-hydroxiethoxi)-methyl]-6-(phenylthio)thymine congeners (HEPT ligands) show in vitroanti-viral activity against the type-1 human immunodeficiency virus (HIV-1), which is theetiologic agent of AIDS. This work shows an extensive QSAR study performed upon a largeseries of 79 HEPT ligands using the MTD and HyperChem molecular modeling methods.The studied HEPT ligands are HIV reverse-transcriptase inhibitors. Their geometries wereoptimized and conformational analysis was carried out to build the hypermolecule, whichallowed applying the MTD method. The hypermolecule was used for space mapping of thereceptor’s interaction site. The obtained results show that there are three 3D molecular zonesimportant for the anti-HIV biological activity of the HEPT ligands under study.


Introduction
The acquired immunodeficiency syndrome (AIDS) appears due to an infection process with viral agents belonging to the retrovirus family: human immunodeficiency viruses 1 (HIV-1) and 2 (HIV-2) [1,2].HIV-1 is the main etiological agent of the disease.It attacks lymphocytes expressing CD4 molecules affecting profoundly the cellular mediated immunity [3].Thus, in time, the HIV-1 infection leads to the clearance of T-CD4 lymphocytes (T helper lymphocytes) allowing the appearance of several potential lethal complications: infections with opportunistic agents, carcinomas.The HIV virus contains a single RNA chain being limited by an envelope.There are genetic differences between the two types [4,5].Actually, the research by means of discovering new pharmacological anti-HIV agents is oriented towards identifying the essential critical steps of viral replication.The most studied is the reverse enzymatic transcription of genomic viral RNA (implying the reverse-transcriptase RT pathway) into double chained DNA.The new formed DNA migrates into the host cell nucleus and it is integrated into its genome by the viral integrase.Most anti-HIV agents approved by the FDA (Food and Drug Administration, USA) are RT-inhibitors [6] and they are 2',3'-dideoxynucleosidic substrate analogs (ddN), like zidovudine (AZT).Other studies proved the anti-HIV inhibition capacity of other pharmacological agents with a different chemical structure.Thus, new drugs have been developed, belonging to non-nucleoside congeners, NN (nevirapine).ddN ligands are analogs of the natural substrate, the triphosphate of dezoxynucleoside.They interact with a binding site at the level of RT.This interaction is possible in vivo when ddN compounds are phosphorilated resulting dezoxynucleosidic 5'-phosphorilated derivatives.Only in this way it is possible these compounds to inhibit competitively the natural substrates (dTTP, dCTP, dATP and dGTP) [7].The 1-[(2-hydroxiethoxi)-methyl]-6-(phenylthio) thymine congeners (HEPT ligands) [8] -see Figure 1 -show in vitro antiviral activity against HIV-1.HEPT ligands inhibit exclusively HIV-1 replication, showing no effect upon other retroviruses including HIV-2.Extensive studies [9] upon synthesis and biological activity of HEPT derivatives showed that methyl in position 5, a cyclic structure in position 6 and a two-atoms HN-3 constellation are essential elements to promote the anti-HIV activity.Tanaka et al. [10] concluded that also derivatives 2: HEPT-S have an improved antiviral activity.This is why studies upon HEPT ligands that inhibit the viral reverse-transcriptase (RT) were structurally extended and major interest was shown to analyze the anti-HIV activity depending on the variation of substituents on the phenyl ring and upon C-5 as well [10].Experimental biological activities led to structure-activity studies, yet only concerning qualitative aspects (SAR), by means of presence or absence of atomic groups in certain positions on the HEPT general structure shown in Figure 1.
Our research is focused on quantitative chemical structure -biologic activity studies based on a 3D method, the Minimal Topological Difference method (MTD).This method, beyond the improved correlation equation delivers also a structural stereochemical requirement necessary for an optimal HEPT ligand attack estimation on the HIV-1 virus.

The MTD (Minimal Topological Difference) Method
Following the Fisher's old "key-into-lock" idea [11], the main premise of the MTD approach is that each potential ligand L i should fit at the best into the binding site of the receptor and that the larger the ligand-binding site misfit, the lower affinity, i.e. the biologic activity A i , would be [12].For quantifying of this steric assumption the QSAR group of Timişoara developed through years the MTD method among the consecrated 3D-QSAR techniques [13,14].Basically, the analytical formulation of MTD can be understood in terms of volume effective picture.In such, the decrease of the ligand affinity to the biologic receptor parallels the increase of their un-overlapped volumes.To better sustain this idea, in Figure 2 there are illustrated two ligand molecules: L 1 that perfectly matches with the binding site of the receptor R, and other, say L 2 , which do not cover the volume V' and additionally producing the V'' volume deformation in the cavity of the binding site receptor R, in (a) and (b), respectively.Since the total steric misfit of the ligand L 2 is proportional with the volume V'+V'' a viable way of quantifying this property of the ligand-receptor interaction provides a steric parameter to be accounted of in a QSAR correlation equation.
To achieve the MTD steric parameters, firstly, worth recalling that, in general, two operation types are used to compare the molecular stereochemistry: • molecular superposition, based on certain common molecular elements or on a "pharmacophoral" constellation" of the atoms, and • defining equivalent positions.
These positions, together with the bonds between them, form a so called "hyperstructure" with both topological and geometrical connotations.
In this context, let's consider N molecules L i (I=1, 2,…, N) with known activities A i for which we wish to obtain a structure-activity correlation equation so that to pose minimum steric/topologic differences MTD i as a quantitative degree of their misfit with the receptor's cavity [15].In short, the MTD method implies the construction and optimization of the so called hypermolecule (H) from where the QSAR equation follows.However, the construction stage requires the superposition of the second or higher period atoms X or XH n , being the hydrogen atoms "contracted" within their bond atoms X.The molecules with fixed orientation are superimposed as atom-over-atom.As a base, an active rigid molecule S will be considered over which all other molecules are superimposed in a maximum way; that is, whenever atoms of different L i give out inter-distances less than 0.5Ǻ, when superimposed or respecting the atoms of the benchmark S molecule, they are considered as belonging to the same vertex j of the hypermolecule H.The resulting vertices j, M 1, j = , represent the approximate positions of the atoms in the R*L i complex, while edges embody the valence bonds between atoms.The activated R*L i complex results from the receptor-ligand interaction: R(aq) + L i (aq where R* is the active conformation of the receptor, induced by the L i ligand binding.Since hydrogen atoms are neglected in the obtained hypermolecule each ligand L i is described by a "vector" {X ij } with binary components (x ij =0, 1).If the vertex j from L i is not occupied, then x ij = 0 otherwise x ij =1.The minimal topological difference MTD i of molecule "i" with respect to the receptor is calculated according to the following formula [11][12][13][14][15][16][17][18]: where ε j = -1, 0 or +1 for vertices attributed to the receptor cavity (beneficial for the biological activity), to the exterior (irrelevant) and to the receptor walls (detrimental), respectively, while s is the total number of cavity vertices (with ε j = -1).In these condition, the term the summation over both the vertices of cavity of R that are not occupied by the ligand L i as well as of those from the wall of R being occupied by the atoms of L i , unless the hydrogen atoms, in equation (1), respectively.As a consequence, the key relation (2) gives the analytical quantification of the steric relation between each ligand "i" of the total tested N respecting the available M vertices of receptor R in the course of bonding.Thus, the degree of steric misfit for the molecule L i with respect to the receptor, i.e.MTD i , is defined as the sum of the number of unoccupied cavity vertices of H and the number of occupied wall vertices of H.If L i shows more low energy conformations, k = 1,…,C i (C i is the number of conformations of molecule I), the conformation that fits best in the receptor site will be retained through the modified version of equation ( 2): In these conditions, the regression equation type of the MTD method looks like: where P i , i= 1,2,…, represent other structural parameters that can be used during the optimization process or after obtaining the optimized receptor chart, say S * , using the corresponding optimized MTD * values.However, the present picture holds under the assumption that the necessary energy to correct the steric misfit shows a linear variation with the corresponding volume misfit.The minus sign in ( 4) is in agreement with the fact that the biological activity decreases with the ligand steric misfit degree.
The optimization procedure starts from an initial } ε { 0 j attribution set, the beginning chart S 0 .Then, one has to calculate the 0 i MTD values and, by employing the multi-linear regression analysis, the regression coefficients in the above equation ( 4) and the associate correlation coefficient 0 r are computed.The algorithm continues with one by one modification of ε j attributions resulting in a number of 2M mono-substituted charts, with the respective MTD i values together with 2M correlation equations of type (4) and their correlation coefficients.The mono-substituted chart with the highest correlation coefficient r from all 2M outputs is taken as starting point for the next cycle.The process continues until the regression coefficient r doesn't increase anymore.The final , represents the optimized receptor chart, S * , and provides through their employment in equation ( 2) the optimized predictor variables MTD * for the QSAR problem [18].

Computational Details
Most actual molecular modeling strategies provide computable data about the analyzed molecular systems, and, also, 3D extended data.The current work summarizes the studies [19] performed upon a series of HEPT derivatives with structural variations in available positions of the uracyl ring, Figure 3. Different substituents in C-5 and C-6 positions on the phenyl ring (bounded by oxygen or a sulphur bridge at C-6) allow for HEPT series to be studied [8,20,21].The recorded effects of various substituents, different from 1-[(2-hydroxiethoxy)-methyl], correlates with the capacity to inhibit the enzyme implied in the inverse transcription of RNA into DNAreverse-transcriptase (RT) enzyme, of which the observed biologic effect stands as the anti-HIV activity.Table I shows the structures of the 79 HEPT derivatives [21] under actual study.

Table I.
The series of structures of the studied HEPT ligand molecules of types of Figure 3. Molecular modeling and conformational analysis were performed using the HyperChem 7.01 program package (MM+ program) [22] with conjugated gradient was employed for the optimization of geometries of the ligands L i of the Table I.
The work conditions are the following: • flexible torsion angles variation: ±60º -±180º; • energetic criteria to accept a conformation: 4 kcal/mol above the minimum; • conformation cancellation: for atoms with distances less than 0.5 Å and torsion angles differences less than 15º; • duplicate conformations: energy differences less than 0.05 kcal/mol;   The superposition of conformers was performed using the HyperChem 7.01 program in order to minimize the sum of square distances between equivalent atoms.As stated before, when the distances between the atoms, after the superposing procedure, were above than 0.5 Å, a new vertex was created.Figure 4 show the superposition of the minimum energy conformations.The obtained hypermolecule has 63 vertices and it is represented in Figure 5.
The beginning chart (S 0 ) was automatically generated with an in house program, considering the weight of "local" molecular activity depending on the occupancy degree of each vertex as in the Table II.
It has the form:
However, the relatively high number of (flexible) torsion angles did not allow applying this method in the multi-conformational variant.Thus, the hypermolecule H was obtained by the superposition of the minimal energy conformations -resulted from the conformational HyperChem analysis -upon the primary heterocyclic common structure of Figure 3 with the result presented in Figures 4 and 5.
The validation of QSAR model was realized using the cross-validation (cv) method.This procedure implies that the series of N ligands L i with the determined activities A i is firstly divided into two subseries, odd and even, corresponding to the successions L 1 , L 3 , L 5 , …, and L 2 , L 4 , L 6 , …, respectively.Then, while the optimized chart S * and the correlation equation is obtained from the odd sub-series the parameters MTD i and the associated activities A i * are evaluated for the molecules of the even subseries.The procedure is completed through the reciprocal consideration, i.e. starting from the even subseries the odd sub-series activities are obtained.Finally, the cross validation correlation index, ( ) through its 2 cv 100 r ⋅ value, approximate the percent from the (experimental) biological activity variance around the average A that can be predicted with the help of the correlation equation ( 4) and of the optimized chart.

Results and Discussions
To analyze QSAR relations of HEPT derivatives with anti-HIV activity, experimental biological activities from Table III were used.The experimental biologic activity is A exp = log (1/ EC 50 ), where EC 50 stands as the level that produces a 50% protection of MT-4 cells against HIV-1 cytopathic effect [8,20,21].
Three type of QSAR are here considered, namely: • the mono-parametric model of hidrophobicity: • the case of linear mono-parametric model where MTD is the steric parameter: • and the combined correlation In equations ( 7) and ( 9), hydrophobicity parameters are calculated using the "QSAR Properties" from the HyperChem [22] program package and they are listed in Table III.MTD i parameters represent the MTD * corresponding to the S * optimized chart of the receptor; α 0 , α 1 and β express the regression parameters corresponding to the obtained QSAR studies.A i is the calculated value of A exp,i .
The optimization process of the hypermolecule allows to optimize the beginning chart of the receptor, S 0 ={ε j } j=1,..., M , using a subsequent changing process of vertices attribution (ε j ) in H, starting with the initial correlation coefficient, r 0 (corresponding to the S 0 beginning chart), until obtaining of a maximal correlation coefficient, r max , corresponding to the receptor's optimized chart, S * ; the r correlation coefficient, on every step of the optimization process, is obtained by minimizing the sum of the squares of differences between experimental A exp,i and calculated A calc,i biological activities: The receptor chart is considered optimized (S * ) when any single new modification ε j does not reduce the value of (10).S * represents a simplified image of R interacting with L i ligands.

Table III. The structural parameters of the HEPT ligands of Table I and their biological activities
against binding with HIV-1 receptor.The hydrophobicities (logP) are derived with the help of HyperChem program [22], while parameters MTD * and MTD * logP are computed with relation (3) upon the optimized charts ( 13) and (15), respectively.The associated calculated activities A logP , A MTD and A MTD+logP are derived from the equations ( 11), ( 12) and ( 14), respectively.The experimental biologic activities (A exp ) are taken from ref. [21].The regression equation regarding the hydrophobicity aspect, i i logP 0.099) 0.790( 0.327) 3.838( A ⋅ ± + ± = (N = 79; r = 0.673; F = 63.85)(11) shows that biological activity increases while the partition octanol/water coefficient increases, indicating that in the studied series the transport of the bioactive compound to the target (receptor site) is important.Still, the corelation coefficient, although significant from the statistical point of view, display a rather modest value.However, this is the indication that steric effects have to be further taken into account either by replacing the hydrofobicity with MTD parameters or by adding them to the correlation equation (11) as will be in next exposed.As a direct comparison, by applying the mono-linear MTD method (8) on the studied series of 79 HEPT anti-HIV derivatives the following statistically validated model was obtained, now with a fine correlation coefficient (r = 0.830): The optimized chart and the vertices attribution of the hypermolecule are shown in Figure 6: white circles represent vertices attributed to the receptor cavity (beneficial), black circles correspond to vertices attributed to receptor walls (detrimental) and the numbered vertices represent those attributed to the exterior (irrelevant).The optimized chart, S * , derived from equation (12) employing the above minimization procedure, see equation (10), looks like: Chart (13) gives information about the stability of vertices attribution in the H hypermolecule and quantifies the place of ligand-receptor binding as (-1) inside the cavity, (+1) inside the walls and (0) corresponding to the steric irrelevant region.
The obtained results using the mono-parametric MTD method (12) prove that the presence of substituents in the R 1 region (C-5 of Figure 3) is proper for interaction only when substituents are of relatively low dimension (methyl, ethyl, isopropyl, cyclopropyl).More voluminous substituents (allyl, propyl, phenethyl, and benzyl) determine the appearance of detrimental vertices influencing negatively the anti-HIV activity.
In the R 2 substituents region (C-6 of Figure 3), phenyl rests interact positively with the receptor site if they are substituted with small radicals in meta positions; methoxi, methoxicarbonyl, nitro and cyano radicals have a detrimental effect.
The most favorable region to interact with the receptor is the R 3 region (N-1 of Figure 3).Single and voluminous substituents bound to the 2-hydroxiethyl rest have a benefic effect on the anti-HIV activity.These substituents, mostly aromatic, interact with a relatively large receptor cavity through hydrophobic (van der Waals) [23] bonds.Radicals in "extreme" regions of the mentioned hydroxiethylic rest lead to detrimental vertices in the region.
Finally, by considering both the hydrophobicity parameter (logP) and the minimal topological difference, a bilinear statistically validated equation is obtained: with a correlation coefficient, r=0.882, significantly improved compared to the simple case of the mono-hydrophobic model (11).This way, the fundamental feature of MTD-QSAR contribution, as enhancing the degree of correlation through accounting of the steric effects, is computationally proved.From equation (14) there is observed that since high values of the MTD parameters characterize the structural and bonding symetries that lead with a dimishment of the biological activity, an opposite situation is met when the parameters are associated with the complementary symetry, see Figure 2 and the relating MTD principle.Therefore, higher biological activity relates with the asymetry degree of the hypermolecule, a feature properly described by the MTD steric quantifications as next discussed.
Using equation ( 14), like before, the optimized MTD parameters chart can be obtained also in this case as:  14) and the optimized chart (15).Now, the vertices attribution in the hypermolecule has been changed, especially in the R 3 substituents region, see Figure 7. Part of the "extreme" vertices of the hydroxiethyl rest became benefic (vertices 14 and 49), while some vertices corresponding to voluminous rests became detrimental (vertices 56 and 59, corresponding to phenyl and cyclohexyl rests).In the mentioned positions it is possible that hydrophobic interactions (which appear when other voluminous rests in the same region are considered) do not exist, whereas hydrophobic interaction favor van der Walls interactions (vertices 53-55, 60 and 61).Most probably, the increase of biological activity simultaneously with the octanol/water partition coefficient increase is due to an improved transport of ligands to the receptor site since the ligand-receptor interaction is characterized through the MTD parameter.
Comprising the hydrophobicity and topological parameters in the MTD analysis it improves significantly the correlation equations qualities, but it does not significantly modify the hypermolecule vertices attribution.The most important change is shown in the R 3 substituents region, where vertices  Multi-parametric models, with structural descriptors of different classes, may lead to correlation coefficients close to 0.9, especially when topological parameters are analyzed with the hydrophobicity parameter; constitutive parameters seem not to improve significantly statistical data [17].

Conclusions
Molecular modeling systems deliver powerful tools in order to build, to observe, to analyze and to keep the obtained molecular systems, from the most simple to the extremely complex biological macromolecules with thousands of atoms.These "tools" are very effective in qualitative (SAR) and quantitative (QSAR) studies.The goal of molecular modeling is not only to explain physical and chemical phenomenon and/or properties, but also to suggest new experiments (meaning in this context new molecular structures).Yet, molecular modeling is not able to give a quantitative prediction upon biological activity (unless few exceptions); however, it can deliver very useful data in order to design new molecular structures with increased bioactive potential [24].
Molecular modeling techniques can be used as well as in direct and indirect drug design (differences are strictly due to scientific knowledge at a given time).Direct drug design implies to know the three-dimensional (3D) receptor properties and their direct employment.Indirect drug design is used when the 3D geometry of the receptor is not known or when the ligand action is not specific or when there is not clear which is the target receptor of a ligand.The method analyses comparatively structural features of known active and inactive compounds, data being integrated in terms of complementarity with a hypothetical model of the biological receptor's site (unknown or with uncertain 3D structure).The first way of SAR and/or QSAR seems more exact but things get more complicated in practice because the receptor topography is determined in other conditions than those in the live body; this means that there appear severe approximations in calculations reducing the value of the obtained conclusions, in many situations after long and expensive studies.The above second method is paradoxically closer to biological realities of receptor-ligand (R-L) interactions, especially when in vivo ligand biological activities are known because they contain all data regarding the interaction process at a molecular level.Therefore, indirect drug design shows an acceptable level of confidence, corresponding to experimental data.
In this context, MTD method is simple and effective to use by allowing quantitative appreciations of chemical structure -biological activity studies and through delivering QSAR models with explanatory and predictive power.This method successfully accompanies the need to have a reliable computational tool to optimize the process of developing new prototype structures (leading compound) and to reduce the necessary costs for a new drug to evolve from a theoretical conception to its appliance in clinical practice.In such, the MTD method consent to identify and to separate the effects that appear in a given biological interaction, to quantify them independently as steric, electronic, hydrophobic, etc. factors and to measure quantitatively biological activity by means of steric correlation analysis.
Since HIV infection still remains a public health issue at a global level, the application of the MTD methods to elucidate the chemical features of the anti-HIV ligand congeners appears as a preeminent actual challenge.This because, unfortunately, the most chemical structure -biological activity studies have been performed only at a qualitative level (SAR) which explains the absence on the market of an "ideal" anti-HIV drug, effective and with less toxic effects.The interaction between bioactive molecules and the RT receptor site was analyzed using the mono-and bi-dimensional MTD method considering only the minimal energy conformations.These conformations were used to obtain the hypermolecule which represents a topological network able to describe the interaction site of the biological receptor with the 79 HEPT ligands.The results through the 79 bioactive compounds indicated the presence of three important regions in order to have biological activity: • the region of R 1 substitute (see Table I) in the common pirymidin-2,4(1H,3H)-dione structure (corresponding to position 5 on this structure) with generally substitutes of small volume (alkyl, arylalkyl); • the region of R 2 substitute in the common structure (position 6) with phenyl-thyo rests unsubstituted or substituted; • the region of R 3 substitute in the common structure (position N-1 in the primary structure of Figure 3) with the highest structural variation.Most of the obtained stable conformations (by means conformational analysis) showed an orientation towards R 3 region of the R 2 radicals; voluminous substituents in the R 3 (phenyl, chlorinephenyl, tolyl, cyclohexyl) indicated an orientation towards R 2 , parallel with the phenyl nucleus from this region, or towards the primary structure in a pseudo-parallel way; this behavior indicates a possible van der Waals interaction between these more or less aromatic rests.There was also concluded that the transport of molecules to the receptor site seems to be very important in the studied series; this is demonstrated via the mono-parametric model, statistically significant, which is obtained using the hydrophobicity (transport) parameter, logP.
Biological activity increases significantly when the logarithm of the octanol/water partition coefficient increases (i.e. the activity increases by almost 1 logarithmic unit when this parameter increases by 1).
Afterall, through considering the MTD method leaves with the possibility of obtaining essential stereochemical data upon the structure of bioactive ligands, pointing out structural requirements necessary to have an improved and significant biological activity.Further 3D-QSAR studies are envisaged targeting the possibility to know the structural requirements for a compound to have an increased anti-HIV activity.

Figure 2 .
Figure 2. Schematic illustration of the MTD principle, namely the biological activity is proportional with the binding degree between the ligands (L 1 , L 2 ) and the receptor R: in (a) as perfect matching and in (b) as the misfit relating the total volume V'+V''.

Figure 3 .
Figure 3.The envisaged positions for substituents of the uracyl ring.

Figure 4 .
Figure 4. Superposition of the minimum energy conformations of the 79 ligands from TableI.

Figure 5 .
Figure 5. Hypermolecule H obtained by superposition of the minimum energy conformations of the 79 HEPT derivatives (M = 63 vertices).

Figure 6 .
Figure 6.The occupancy of the hypermolecule vertices within the linear mono-parametric MTD model given by equation (12) and the optimized chart (13).

Figure 7 .
Figure 7.The occupancy of the hypermolecule vertices within the bilinear MTD model given by equation (14) and the optimized chart (15).
Conformational analysis allowed the most stable conformations (with the lowest energy level) for each of the 79 HEPT ligands of TableI.The 79 minimal energy conformations optimized with the MM+ program were superposed on the cyclic structure: pyrimidine-2,4(1H,3H)-dione.

Table II .
Vertices occupancy in hypermolecule H of Figure5by the ligands of TableI.
rests close to R 2 region (including those corresponding to substituents in o, m, p of phenyl in R 2 region) become detrimental for interaction.