Virtual Prospection of Marine Cyclopeptides as Therapeutics by Means of Conceptual DFT and Computational ADMET

Bioactive peptides are chemical compounds formed when amino acids are joined by covalent amide (peptide) bonds. Owing to their unusual chemistry and varied biological effects, marine bioactive peptides have attracted considerable research attention. The effectiveness of a bioactive marine peptide is attributed to its structural features, such as amino acid content and sequence, which vary depending on the degree of action. Cyclic peptides combine several favorable properties, such as good binding affinity, target selectivity and low toxicity, that render them an attractive modality for the development of therapeutics. The apratoxins are a class of cyclic depsipeptides with potent cytotoxic activities. The objective of this research is to pursue a computational prospection of the molecular structures and properties of several cyclopeptides of marine origin with potential therapeutic applications. The methodology will be based on the determination of the chemical reactivity descriptors of the studied molecules through the Conceptual DFT model and validation of a particular model chemistry, MN12SX/Def2TZVP/H2O. These studies will be complemented by a determination of the pharmacokinetics and ADMET parameters using certain cheminformatics tools.


Introduction
Bioactive peptides are chemical compounds formed when amino acids are joined by covalent amide (peptide) bonds. Although some bioactive peptides occur in their natural state, the vast majority of known bioactive peptides are encoded in the structure of parent proteins and are released only through enzymatic activity. Owing to their unusual chemistry and varied biological effects, marine bioactive peptides have attracted considerable research attention. The effectiveness of a bioactive marine peptide is attributed to its structural features, such as amino acid content and sequence, which vary depending on the degree of action [1][2][3].
Computational Medicinal Chemistry (CMC) has established itself as a critical component of contemporary drug discovery. The goal of CMC is not to replace in vitro or in vivo tests, but rather to accelerate the discovery process, reduce the number of candidates that must be evaluated experimentally, and rationalize the selection of those candidates that are investigated [4,5]. Density Functional Theory (DFT) represents a computational methodology that has gained popularity in recent years for calculating molecular properties. DFT has been extensively used to compute the electronic structure and properties of molecules in both the ground and excited electronic states and in both the gas and aqueous phases. Conceptual DFT (CDFT), as a specialized branch of DFT, was created to develop a chemical reactivity theory on the basis of certain chemical concepts that arise from DFT by expressing their properties in terms of reactivity descriptors. It has been successfully used for the study of the chemical reactivity of atoms, molecules, and organic and metallic clusters [6][7][8][9][10][11][12][13][14][15][16][17].
Several other critical parameters in the development of new drug molecules contribute to the definition of overall safety margins, dose intervals, and dose amounts. These parameters include absorption, distribution, metabolism, excretion, and toxicity (ADMET). The comprehensive investigation of physicochemical parameters is crucial, as medicinal chemists can easily draw links to such physicochemical parameters. When developing a new medicine, the optimal balance of physicochemical qualities and ADMET criteria must be achieved. Research on these properties using Computational Chemistry and Molecular Modeling techniques has been labelled Computational ADMET and is being increasingly considered within the context of drug discovery and design (http://crdd.osdd.net/admet.php (accessed on 22 February 2022)).
The objective of this research is to pursue a computational prospection of the molecular structure and properties of several cyclopeptides of marine origin with potential therapeutic applications. Cyclic peptides combine several favorable properties, such as good binding affinity, target selectivity and low toxicity, that make them an attractive modality for the development of therapeutics. The apratoxins represent a class of cyclic depsipeptides with potent cytotoxic activities, usually in the nanomolar range, characterized by a thiazoline unit and an extensive polyketide-derived moiety as part of the macrocyclic structure [18]. Apratoxins A-C are cyclodepsipeptides isolated from Lyngbya majuscula with in vitro cytotoxicity against LoVo and KB cell lines [19]. Apratoxin D is a potent cytotoxic cyclodepsipeptide from Papua New Guinea collections of the marine cyanobacteria Lyngbya majuscula and Lyngbya sordida [19]. Apratoxin E was isolated from Lyngbya bouillonii and showed higher cytotoxicity than its closest analog, the semi-synthetic E-dehydroapratoxin A, against various cancer cell lines derived from bone, colon, and cervix, with a range of activity values, although it was less active than Apratoxin A [20]. Apratoxins F and G are two cytotoxic cyclic depsipeptides isolated from Lyngbya bouillonii collected from Palmyra, characterized by the presence of an N-methyl alanine residue at a position where earlier Apratoxins contained a proline unit [21,22]. Graphical sketches of the molecular structures of Apratoxins A-G, as retrieved from the PubChem database (https://pubchem.ncbi.nlm.nih.gov (accessed on 22 February 2022)), are shown in Figure 1.
The methodology will be based on the determination of the chemical reactivity descriptors of the studied molecules through the Conceptual DFT model and validation of a particular model chemistry, MN12SX/Def2TZVP/H2O, through an earlier proposed procedure [23][24][25][26][27]. These studies will be complemented by a determination of the pharmacokinetics and ADMET parameters using certain cheminformatics tools.

Results and Discussion
The Kohn-Sham (KS) methodology includes the determination of the molecular energy, electronic density and orbital energies of a given system, in particular the frontier orbitals HOMO and LUMO, which are intrinsically related to the chemical reactivity of molecules [28][29][30][31]. The quality of a density functional can be assessed by comparing its results with experimental values or high-level computations. A methodology referred to as KID was developed by our research group [23][24][25][26] to avoid these comparisons and to validate the ability of a given density functional to fulfil the Janak and Ionization Energy theorems [32][33][34][35][36]. The correspondence between $\epsilon_H$ and $-I$, and between $\epsilon_L$ and $-A$, is verified through the formulas $J_I = \epsilon_H + E_{gs}(N-1) - E_{gs}(N)$, $J_A = \epsilon_L + E_{gs}(N) - E_{gs}(N+1)$, and $J_{HL} = \sqrt{J_I^2 + J_A^2}$, with $\epsilon_H$ and $\epsilon_L$ representing the HOMO and LUMO energies of the marine cyclopeptides considered in this research and $E_{gs}$ the ground-state energy of the system with the indicated number of electrons. An extra KID descriptor, $\Delta SL$, equal to the difference in energy between the SOMO (corresponding to the radical anion's HOMO) and the neutral system's LUMO, was devised to aid in the verification of this methodology's accuracy [23][24][25][26]. The results of these calculations are presented in Table 1, while the corresponding optimized molecular structures are displayed in Figure 2. As can be appreciated from Table 1, the use of the MN12SX/Def2TZVP/H2O model chemistry is justified, since the KID procedure shows the fulfilment of the Janak and Ionization Energy theorems for all molecular systems considered in this research.
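The KID check described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the function name, argument names and sample energies are hypothetical, all energies are assumed to be in the same unit (eV here), and taking absolute values so that each descriptor measures the size of a deviation is an assumption of this sketch.

```python
import math


def kid_descriptors(e_homo, e_lumo, e_somo, e_neutral, e_cation, e_anion):
    """Return (J_I, J_A, J_HL, delta_SL) for one molecule.

    e_homo, e_lumo : frontier orbital energies of the neutral system
    e_somo         : HOMO energy of the radical anion (the SOMO)
    e_neutral      : ground-state energy with N electrons
    e_cation       : ground-state energy with N - 1 electrons
    e_anion        : ground-state energy with N + 1 electrons
    Absolute values are an assumption of this sketch.
    """
    j_i = abs(e_homo + e_cation - e_neutral)   # deviation of HOMO from -I
    j_a = abs(e_lumo + e_neutral - e_anion)    # deviation of LUMO from -A
    j_hl = math.sqrt(j_i ** 2 + j_a ** 2)      # combined deviation
    delta_sl = abs(e_somo - e_lumo)            # SOMO versus neutral LUMO
    return j_i, j_a, j_hl, delta_sl


# Hypothetical energies in eV, chosen only to exercise the formulas.
j_i, j_a, j_hl, delta_sl = kid_descriptors(
    e_homo=-6.0, e_lumo=-1.5, e_somo=-1.4,
    e_neutral=-1000.0, e_cation=-993.9, e_anion=-1001.5,
)
print(j_i, j_a, j_hl, delta_sl)
```

Values of all four descriptors close to zero would indicate that the chosen density functional behaves according to the two theorems for that molecule.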
This methodology is convenient when considering quantitative properties related to Conceptual DFT descriptors [6][7][8][9][37,38]. The definitions of the global reactivity descriptors, starting with the electronegativity, are those given in the cited Conceptual DFT literature [6][7][8][9][37,38]. These global reactivity descriptors have been complemented by the Nucleophilicity Index N [39][40][41][42][43]. The results for the Conceptual DFT reactivity descriptors of the selected peptides are displayed in Table 2. Analysis of these descriptors reveals information about the stability, electrophilicity, and nucleophilicity of the compounds under investigation. It can be appreciated from Tables 1 and 2 that Apratoxin A exhibits the largest global hardness (or HOMO-LUMO gap) among the molecules. Thus, it will be the least reactive of the studied peptides, and in turn, Apratoxin C will be the most reactive. Apratoxins B and G display the highest electronic chemical potentials and can efficiently exchange electron density with the environment. From the study of the electrophilicity of a series of reagents involved in Diels-Alder reactions, an electrophilicity ω scale was proposed that classifies organic molecules as strong (ω > 1.5 eV), moderate (0.8 < ω < 1.5 eV) or marginal (ω < 0.8 eV) electrophiles [41,44,45]. By inspection of Table 2, it can be observed that, with the exception of Apratoxin A, the peptides may be regarded as moderate electrophiles. Organic molecules can likewise be classified as strong (N > 3.0 eV), moderate (2.0 eV ≤ N ≤ 3.0 eV), and marginal (N < 2.0 eV) nucleophiles in polar organic reactions [40]. From Table 2, it can be concluded that all peptides under investigation may be considered moderate nucleophiles.
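As an illustration of how such descriptors and scales are applied, the sketch below computes global descriptors from frontier orbital energies and classifies ω and N values with the thresholds quoted above. The expressions used are common finite-difference forms and are an assumption here: the hardness is taken as the HOMO-LUMO gap, consistent with the identification made in the text, and ω = μ²/(2η); other conventions differ by a factor of two, and the conventions behind Table 2 are not shown in the text. Function names and sample values are hypothetical, and the treatment of values lying exactly on a threshold is a choice of this sketch.

```python
def global_descriptors(e_homo, e_lumo):
    """Global reactivity descriptors (eV) from frontier orbital energies.

    Assumed conventions: chi = -(e_homo + e_lumo) / 2, eta = e_lumo - e_homo,
    mu = -chi, omega = mu**2 / (2 * eta).
    """
    chi = -0.5 * (e_homo + e_lumo)   # electronegativity
    eta = e_lumo - e_homo            # global hardness (HOMO-LUMO gap)
    mu = -chi                        # electronic chemical potential
    omega = mu ** 2 / (2.0 * eta)    # global electrophilicity
    return {"chi": chi, "eta": eta, "mu": mu, "omega": omega}


def classify_electrophile(omega):
    """Electrophilicity scale: strong > 1.5 eV, marginal < 0.8 eV."""
    if omega > 1.5:
        return "strong"
    if omega >= 0.8:
        return "moderate"
    return "marginal"


def classify_nucleophile(n_index):
    """Nucleophilicity scale: strong > 3.0 eV, marginal < 2.0 eV."""
    if n_index > 3.0:
        return "strong"
    if n_index >= 2.0:
        return "moderate"
    return "marginal"


# Hypothetical orbital energies in eV, not taken from Table 1.
d = global_descriptors(e_homo=-6.0, e_lumo=-2.0)
print(d, classify_electrophile(d["omega"]), classify_nucleophile(2.5))
```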
A simple QSAR relationship, pKa = 16.3088 − 0.8268 η, was employed to estimate the pKa of the cyclopeptides using the methods provided earlier; this relationship has proved useful in the study of amino acids and short peptides, as well as in the development of inhibitors of Advanced Glycation End Products (AGEs) [46]. These results, together with some pharmacokinetic parameters of utility in drug design and discovery, are reported in Table 3. Although this research deals with the use and validation of certain computational techniques for determining the chemical reactivity properties of the studied molecules, it would be desirable to identify some correlation between the Conceptual DFT descriptors and the pharmacokinetic and ADMET indices, as was done for the pKa values. However, a set of only seven molecules is too small to support meaningful QSAR relationships. Some qualitative correlations can be mentioned instead. For example, it can be seen from Table 3 that Apratoxin D exhibits the largest value of logP, but is also among the peptides with the lowest values of electrophilicity ω (Table 2). This opens the way for identifying QSAR relationships in future studies with a larger number of molecular systems.
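The linear relationship quoted above is simple to evaluate. The snippet below is a minimal sketch; the function name is hypothetical, η is assumed to be the global hardness in eV, and the sample value is illustrative rather than taken from Table 2.

```python
def pka_from_hardness(eta_ev):
    """Estimate pKa from the global hardness via pKa = 16.3088 - 0.8268 * eta."""
    return 16.3088 - 0.8268 * eta_ev


# Illustrative hardness of 5.0 eV (not a value from the paper's tables).
print(round(pka_from_hardness(5.0), 4))
```

Because the slope is negative, a harder (less reactive) molecule is assigned a lower pKa by this relationship.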
The Dual Descriptor DD is a local reactivity descriptor defined as $DD = (\partial f(\mathbf{r})/\partial N)_{\upsilon(\mathbf{r})}$ [10][47][48][49][50][51]. A molecule's nucleophilic and electrophilic sites can be identified unambiguously using the Dual Descriptor [51]. To further understand the local chemical reactivity of Apratoxins A-G, the Dual Descriptor for these compounds is shown graphically in Figure 3. The estimated Bioactivity Scores of the Apratoxins A-G family of marine cyclopeptides are displayed in Table 4. These results indicate that all members of the Apratoxin family, with the exception of F and G, will behave as protease inhibitors. Furthermore, Apratoxins A, B, C and E could act as GPCR ligands. The computed ADMET pharmacokinetic profiles of the Apratoxins A-G family of marine cyclopeptides are presented in Table 5.
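The Dual Descriptor discussed above is displayed in the paper as a graphical isosurface. A commonly used atom-condensed approximation, which is a swapped-in simplification and not the procedure used for Figure 3, estimates it from atomic charges of the neutral, anionic and cationic systems at the same geometry. The sketch below assumes hypothetical charge lists and uses the finite-difference Fukui functions f⁺ = q(N) − q(N+1) and f⁻ = q(N−1) − q(N).

```python
def condensed_dual_descriptor(q_neutral, q_anion, q_cation):
    """Atom-condensed dual descriptor from three sets of atomic charges.

    q_neutral, q_anion, q_cation: per-atom charges for the N, N + 1 and
    N - 1 electron systems, in the same atom order.
    Positive values flag electrophilic sites (prone to nucleophilic attack);
    negative values flag nucleophilic sites.
    """
    if not (len(q_neutral) == len(q_anion) == len(q_cation)):
        raise ValueError("charge lists must have the same length")
    dual = []
    for q0, qa, qc in zip(q_neutral, q_anion, q_cation):
        f_plus = q0 - qa    # electron density gained on adding an electron
        f_minus = qc - q0   # electron density lost on removing an electron
        dual.append(f_plus - f_minus)
    return dual


# Hypothetical two-atom example; charges sum to 0, -1 and +1 respectively.
print(condensed_dual_descriptor([0.2, -0.2], [-0.5, -0.5], [0.4, 0.6]))
```

In this toy example the first atom would be flagged as the electrophilic site and the second as the nucleophilic site, and the condensed values sum to zero, as expected for a descriptor that integrates to zero over the molecule.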
As the information presented mixes categorical (positive or negative) descriptors with numerical values, it is not possible to identify QSAR relationships between the ADMET properties and the Conceptual DFT reactivity descriptors. From Table 5, it can be appreciated that all the members of the Apratoxin family of cyclopeptides will display good gastrointestinal absorption but not BBB permeability. All the molecules could behave as P-gp substrates and also act as P-gp I inhibitors and, with the exception of Apratoxin E, as P-gp II inhibitors. Their behavior towards the different variants of cytochrome P450 will differ, as displayed in Table 5, with some particularities for each of the variants. AMES toxicity is not predicted for any of the studied peptides, although hepatotoxicity is. All peptides are predicted not to inhibit hERG I or hERG II, with the exception of Apratoxin F regarding hERG II inhibition. Finally, none of the cyclopeptides is predicted to cause skin sensitization.

Conceptual DFT Studies
The main approach of our research is based on the application of Conceptual DFT [6][7][8][9][10][11][12] for the prediction of the chemical reactivity properties of the studied molecules. The starting point is the calculation of their ground-state molecular structures, determining the electronic densities and, from these, the corresponding molecular and orbital energies, mainly those of the Highest Occupied Molecular Orbital (HOMO) and the Lowest Unoccupied Molecular Orbital (LUMO). As usual, the many conformers of the studied peptides will be predicted using the MarvinView 17.15 software from ChemAxon (http://www.chemaxon.com (accessed on 22 February 2022)), with the MMFF94 force field employed for the Molecular Mechanics calculations. Every selected conformer of each peptide will be subjected to a geometry optimization and frequency calculation by means of the Density Functional Tight Binding (DFTBA) methodology [52] to obtain suitable starting molecular structures. This will be followed by geometry reoptimization, frequency analysis and calculation of the electronic properties and the chemical reactivity descriptors of the cyclopeptides with the MN12SX/Def2TZVP/H2O model chemistry [53][54][55] within the context of the Kohn-Sham (KS) approach [28][29][30][31]. The absence of imaginary frequencies will be checked to confirm that the optimized structures correspond to minima on the energy landscape. Gaussian 16 software [52] and the SMD solvation model [56] will be used, since the chosen model chemistry was previously shown to fulfil the 'Koopmans in DFT' (KID) procedure [23][24][25][26]. This methodology is useful to verify whether a given density functional behaves according to the Janak and Ionization Energy theorems.
It has been previously shown [27] that, although calculations performed with the ωB97XD density functional fulfil these theorems in the absence of a solvent, the MN12SX density functional performs much better when water is present as the solvent.
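The two-stage optimization protocol described above can be illustrated with a small helper that assembles Gaussian-style route lines and checks a frequency list for imaginary modes. This is a sketch under stated assumptions: the helper names are hypothetical, the route strings are assembled from Gaussian keyword conventions (DFTBA, opt, freq, scrf with the SMD option) and should be verified against the Gaussian 16 manual before use, and imaginary frequencies are assumed to be reported as negative numbers.

```python
def gaussian_route(method, basis=None, solvent=None):
    """Build a route line for an optimization plus frequency job.

    The keyword spelling is an assumption to be checked against the
    Gaussian 16 documentation; this helper only formats a string.
    """
    level = f"{method}/{basis}" if basis else method
    parts = ["#p", level, "opt", "freq"]
    if solvent:
        parts.append(f"scrf=(smd,solvent={solvent})")
    return " ".join(parts)


def is_minimum(frequencies_cm1):
    """True if no imaginary (negative) vibrational frequency is present."""
    return all(freq > 0.0 for freq in frequencies_cm1)


# Stage 1: pre-optimization of each conformer with DFTBA.
print(gaussian_route("DFTBA"))
# Stage 2: reoptimization with the MN12SX/Def2TZVP model chemistry in water.
print(gaussian_route("MN12SX", basis="Def2TZVP", solvent="water"))
# Hypothetical frequency lists in cm^-1.
print(is_minimum([35.2, 48.9, 1710.4]), is_minimum([-120.5, 48.9, 1710.4]))
```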

Computational ADMET
It is crucial to understand pharmacokinetics, or the fate of a given molecule in the body, during the drug research and design process. This can be estimated in terms of individual indices collectively known as ADMET (absorption, distribution, metabolism, excretion, and toxicity). Computer models are frequently used as an alternative to experimental methods for establishing these parameters. Chemicalize, a software tool developed by ChemAxon (http://www.chemaxon.com (accessed on 22 February 2022)), was considered for this purpose, while additional, more detailed information on pharmacokinetic parameters and ADMET properties was obtained using admetSAR [57], a tool that predicts these properties from SMILES (http://lmmd.ecust.edu.cn/admetsar2/ (accessed on 22 February 2022)). Molinspiration software (https://www.molinspiration.com/ (accessed on 22 February 2022)) was used to compute numerous molecular characteristics and to predict the bioactivity scores for pharmacological targets, covering enzyme inhibitors, nuclear receptor ligands, kinase inhibitors, GPCR ligands, and ion channel modulators.

Conclusions
Through the combination of a methodology based on the estimation of Conceptual DFT reactivity descriptors, a procedure for validating the fulfilment of the Janak and Ionization Energy theorems, and certain informatics tools aiding in the estimation of pharmacokinetic parameters and ADMET indices, information regarding the potential therapeutic properties of a family of cyclopeptides of marine origin has been reported. The results could provide a basis and starting point for future experimental and clinical research on these interesting molecules. These conclusions pave the way for considering these Conceptual DFT reactivity indices as descriptors of bioactivity in future studies employing a larger number of potential therapeutic drugs.