1. Introduction
Serotonin receptor 5-HT7 (5-HT7R) is a representative of the G protein-coupled receptors (GPCRs), the largest and most diverse family of proteins encoded in the human genome. Serotonin, the endogenous ligand of 5-HT7R, plays important roles in the organism, such as the regulation of mood, sleep, temperature, appetite and other physiological processes; 5-HT7R is therefore an important drug target for a wide range of disorders [1].
Numerous ligands target 5-HT7R (over 3500 records referring to this receptor are present in the ChEMBL database [2]); however, the existing drugs that modulate the activity of this receptor still have limitations due to the side effects associated with their use. Therefore, the search for new 5-HT7R ligands with an optimized safety profile is highly desirable.
Metabolic stability is one of the key parameters to optimize during the search for new drugs, because a molecule must remain in the receptor binding site in its unchanged form long enough to trigger the desired biological response [3]. However, metabolic stability is a complex phenomenon influenced by many factors; its computational prediction is therefore difficult, and the accuracy of existing approaches is often insufficient.
In this study, we constructed a set of machine learning (ML) models for the evaluation of compound metabolic stability. The models are part of a larger ADMET platform, which will cover the following properties: solubility, metabolic stability, biological membrane permeability, hERG channel blocking, and mutagenicity. The properties are assessed in a ligand-based and, where possible, structure-based manner using 1- and 2-dimensional descriptors, key-based fingerprints and structural interaction fingerprints.
2. Methods
Here, we present the results of the predictive models obtained for metabolic stability. The models were built on compound half-lifetime data fetched from the ChEMBL database [2]; human, mouse, and rat data were used, and separate models were prepared for each dataset. Compounds were represented with 1- and 2-dimensional (1d2d) descriptors calculated with PaDEL-Descriptor [4] and with the Extended Fingerprinter (ExtFP) fingerprint from the same software package. Six algorithms were used as predictive models: SMOreg, SMO, the k-nearest neighbour algorithm (IBk), Naïve Bayes, Random Forest and J48 (listed as the sixth algorithm in Table 1). In addition, we developed a methodology that returns the structures of the most similar compounds from the training set together with the half-lifetime predictions, so that the results can be manually validated and the possible influence of particular compound substructures on the predictions can be examined.
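The retrieval of similar training-set compounds described above can be illustrated with a minimal sketch. The text does not state which similarity measure was used, so the Tanimoto coefficient on binary fingerprint bits is an assumption here, and the function names, compound identifiers and half-lifetime values are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices.

    Assumption: the source does not name its similarity measure.
    """
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def most_similar(
    query: Set[int],
    training: Dict[str, Tuple[Set[int], float]],
    k: int = 3,
) -> List[Tuple[str, float, float]]:
    """Return up to k training compounds ranked by similarity to the query.

    Each hit is (compound_id, similarity, measured_half_lifetime).
    """
    hits = [
        (cid, tanimoto(query, fp), t_half)
        for cid, (fp, t_half) in training.items()
    ]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:k]


# Hypothetical toy training set: id -> (fingerprint bits, half-lifetime)
training = {
    "cmpd_A": ({1, 4, 7, 9}, 0.4),
    "cmpd_B": ({1, 4, 8}, 1.5),
    "cmpd_C": ({2, 3, 5}, 3.0),
}
query = {1, 4, 7}

for cid, sim, t_half in most_similar(query, training, k=2):
    print(f"{cid}: similarity={sim:.2f}, half-lifetime={t_half}")
```

Showing the nearest neighbours next to a prediction lets a chemist check whether the training set contains close analogues of the query and how their measured half-lifetimes compare with the predicted value.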
3. Results
In the classification studies, we divided the data into three stability classes using the following thresholds for half-lifetime values: low (≤0.6), medium (>0.6 and ≤2.32), and high (>2.32). The evaluation parameters obtained in 10-fold cross-validation studies are gathered in Table 1.
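The class assignment can be sketched as a simple binning function. The thresholds come from the text; the unit of the half-lifetime is not stated there and is left unspecified. The function names and the example values are hypothetical, and overall accuracy is taken here as the fraction of compounds assigned to the correct class.

```python
from typing import List

LOW_MAX = 0.6      # half-lifetime <= 0.6 -> low
MEDIUM_MAX = 2.32  # 0.6 < half-lifetime <= 2.32 -> medium, above -> high


def stability_class(half_lifetime: float) -> str:
    """Map a half-lifetime value to one of the three stability classes."""
    if half_lifetime <= LOW_MAX:
        return "low"
    if half_lifetime <= MEDIUM_MAX:
        return "medium"
    return "high"


def overall_accuracy(true_labels: List[str], predicted_labels: List[str]) -> float:
    """Fraction of compounds whose predicted class matches the true class."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical measured and predicted half-lifetimes for five compounds
measured = [0.3, 0.6, 1.0, 2.32, 5.0]
predicted = [0.5, 0.9, 1.2, 2.0, 4.1]

true_classes = [stability_class(v) for v in measured]
predicted_classes = [stability_class(v) for v in predicted]
print(true_classes)
print(overall_accuracy(true_classes, predicted_classes))
```

In this toy example the second compound sits exactly on the low/medium boundary, which shows how a small prediction error near a threshold changes the assigned class.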
The results indicate that the constructed tools are capable of predicting metabolic stability with good accuracy. Neither compound representation was clearly preferable, as 1d2d descriptors and ExtFP provided prediction accuracy at similar levels, and the accuracy depended mainly on the ML algorithm applied. SMO, IBk and Random Forest provided high prediction accuracy regardless of compound representation (overall accuracy over 0.7 in all considered cases).
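The statement about the 0.7 level can be checked directly against the overall accuracy rows of Table 1. The sketch below transcribes those values; the data layout and the function name are illustrative choices, not part of the original study.

```python
from typing import Dict, List, Tuple

ALGORITHMS = ["SMOreg", "SMO", "IBk", "Naive Bayes", "Random Forest", "J48"]

# Overall accuracy from Table 1, keyed by (species, representation),
# with values in the same order as ALGORITHMS.
ACCURACY: Dict[Tuple[str, str], List[float]] = {
    ("human", "1d2d"): [0.524, 0.739, 0.72, 0.517, 0.726, 0.66],
    ("human", "ExtFP"): [0.698, 0.725, 0.711, 0.571, 0.728, 0.682],
    ("rat", "1d2d"): [0.553, 0.777, 0.775, 0.566, 0.767, 0.704],
    ("rat", "ExtFP"): [0.528, 0.771, 0.754, 0.657, 0.762, 0.718],
    ("mouse", "1d2d"): [0.632, 0.743, 0.736, 0.533, 0.73, 0.667],
    ("mouse", "ExtFP"): [0.696, 0.751, 0.737, 0.665, 0.743, 0.686],
}


def consistently_above(threshold: float) -> List[str]:
    """Algorithms whose accuracy exceeds the threshold for every species
    and both compound representations."""
    return [
        name
        for i, name in enumerate(ALGORITHMS)
        if all(row[i] > threshold for row in ACCURACY.values())
    ]


print(consistently_above(0.7))  # -> ['SMO', 'IBk', 'Random Forest']
```

The remaining three algorithms fall below 0.7 in at least one species and representation combination.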
4. Conclusions
In summary, a platform for the prediction of ADMET parameters of compounds is being developed. Using metabolic stability assessment as an example, the methods proved their validity and usefulness in this type of task. In particular, SMO, IBk and Random Forest provided overall accuracy above 0.7 for both compound representations and all three species. The developed tools will support medicinal chemists by enabling instant in silico evaluation of compounds selected for synthesis and experimental evaluation.
Author Contributions
Conceptualization, S.P.; methodology, S.P. and R.K.; software, R.K.; validation, S.P. and R.K.; investigation, S.P. and R.K.; data curation, S.P. and R.K.; writing—original draft preparation, S.P.; writing—review and editing, S.P.; supervision, S.P.; project administration, S.P.; funding acquisition, S.P. All authors have read and agreed to the published version of the manuscript.
Funding
The study was supported by the grant OPUS 2018/31/B/NZ2/00165 financed by the National Science Centre, Poland (www.ncn.gov.pl, accessed on 22 September 2022).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Nichols, D.E.; Nichols, C.D. Serotonin receptors. Chem. Rev. 2008, 108, 1614–1641.
2. Gaulton, A.; Bellis, L.J.; Bento, A.P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; et al. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100–D1107.
3. Słoczyńska, K.; Gunia-Krzyżak, A.; Koczurkiewicz, P.; Wójcik-Pszczoła, K.; Żelaszczyk, D.; Popiół, J.; Pękala, E. Metabolic stability and its role in the discovery of new chemical entities. Acta Pharm. 2019, 69, 345–361.
4. Yap, C.W. PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. J. Comput. Chem. 2011, 32, 1466–1474.
Table 1.
Evaluating parameter values obtained in 10-fold cross-validation studies.
| Species | Metric | 1d2d SMOreg | 1d2d SMO | 1d2d IBk | 1d2d Naïve Bayes | 1d2d Random Forest | 1d2d J48 | ExtFP SMOreg | ExtFP SMO | ExtFP IBk | ExtFP Naïve Bayes | ExtFP Random Forest | ExtFP J48 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| human | Overall accuracy | 0.524 | 0.739 | 0.72 | 0.517 | 0.726 | 0.66 | 0.698 | 0.725 | 0.711 | 0.571 | 0.728 | 0.682 |
| human | AUROC | | 0.836 | 0.8 | 0.708 | 0.886 | 0.7333 | | 0.821 | 0.792 | 0.757 | 0.881 | 0.781 |
| rat | Overall accuracy | 0.553 | 0.777 | 0.775 | 0.566 | 0.767 | 0.704 | 0.528 | 0.771 | 0.754 | 0.657 | 0.762 | 0.718 |
| rat | AUROC | | 0.819 | 0.813 | 0.698 | 0.912 | 0.773 | | 0.817 | 0.87 | 0.821 | 0.906 | 0.774 |
| mouse | Overall accuracy | 0.632 | 0.743 | 0.736 | 0.533 | 0.73 | 0.667 | 0.696 | 0.751 | 0.737 | 0.665 | 0.743 | 0.686 |
| mouse | AUROC | | 0.753 | 0.781 | 0.673 | 0.872 | 0.729 | | 0.776 | 0.846 | 0.809 | 0.848 | 0.742 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).