Classification of MLH1 Missense VUS Using Protein Structure-Based Deep Learning-Ramachandran Plot-Molecular Dynamics Simulations Method
Abstract
1. Introduction
2. Results
2.1. Construct Mutant MLH1 Structures
2.2. Use Wildtype Structure to Compensate for Limited Benign Structures
2.3. Tuning Parameter of DL-RP-MDS
2.4. Classification of VUS Variants by DL-RP-MDS
3. Discussion
4. Materials and Methods
4.1. Source of Missense Variants
4.2. Molecular Dynamics Simulation
4.3. Deep Learning Ramachandran Plot Molecular Dynamics Simulation
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Genome Position (GRCh37) | Nucleotide Change | Amino Acid Change | Location | DL-RP-MDS P(U) | DL-RP-MDS P(D) |
---|---|---|---|---|---|
chr3:37061892 | c.976G>A | p.V326M | αI | 0.02 | 0.98 |
chr3:37050393 | c.542G>A | p.G181D | αF | 0.05 | 0.95 |
chr3:37061902 | c.986A>G | p.H329R | αI | 0.06 | 0.94 |
chr3:37053527 | c.614G>T | p.R205M | β8 | 0.06 | 0.94 |
chr3:37050317 | c.466T>C | p.F156L | | 0.07 | 0.93 |
chr3:37042467 | c.229T>G | p.C77G | | 0.09 | 0.91 |
chr3:37053333 | c.568A>G | p.I190V | | 0.09 | 0.91 |
chr3:37059069 | c.863C>G | p.T288R | | 0.10 | 0.90 |
chr3:37061811 | c.895A>G | p.S299G | | 0.11 | 0.89 |
chr3:37055988 | c.743A>G | p.N248S | | 0.11 | 0.89 |
chr3:37038126 | c.133A>T | p.T45S | | 0.12 | 0.88 |
chr3:37035074 | c.36C>G | p.D12E | αA | 0.12 | 0.88 |
chr3:37035064 | c.26G>A | p.R9Q | | 0.12 | 0.88 |
chr3:37055973 | c.728A>G | p.N243S | β10 | 0.13 | 0.87 |
chr3:37061810 | c.894C>G | p.I298M | β12 | 0.14 | 0.86 |
chr3:37059071 | c.865C>T | p.H289Y | | 0.15 | 0.85 |
chr3:37059009 | c.803A>C | p.E268A | | 0.15 | 0.85 |
chr3:37053325 | c.560A>G | p.N187S | αF | 0.15 | 0.85 |
chr3:37038156 | c.163G>A | p.G55S | | 0.15 | 0.85 |
chr3:37053523 | c.610G>A | p.V204I | β8 | 0.15 | 0.85 |
Method | Deleterious Variants * | Total Deleterious (%) | Tolerated Variants ** | Total Tolerated (%) |
---|---|---|---|---|
MutationTaster | 440 | 98.4 | 7 | 1.6 |
fathmm_MKL_coding | 436 | 97.5 | 11 | 2.5 |
BayesDel_addAF | 431 | 96.4 | 16 | 3.6 |
M_CAP | 430 | 96.2 | 17 | 3.8 |
FATHMM | 426 | 95.3 | 21 | 4.7 |
BayesDel_noAF | 418 | 93.5 | 29 | 6.5 |
LIST_S2 | 414 | 92.6 | 33 | 7.4 |
ClinPred | 393 | 87.9 | 54 | 12.1 |
DEOGEN2 | 384 | 85.9 | 63 | 14.1 |
LRT | 381 | 85.2 | 66 | 14.8 |
REVEL | 380 | 85.0 | 67 | 15.0 |
fathmm_XF_coding | 358 | 80.1 | 89 | 19.9 |
MetaRNN | 356 | 79.6 | 91 | 20.4 |
SIFT | 316 | 70.7 | 131 | 29.3 |
MetaLR | 301 | 67.3 | 146 | 32.7 |
PROVEAN | 290 | 64.9 | 157 | 35.1 |
MutationAssessor | 285 | 63.8 | 162 | 36.2 |
SIFT4G | 282 | 63.1 | 165 | 36.9 |
MetaSVM | 281 | 62.9 | 166 | 37.1 |
Polyphen2_HDIV | 247 | 55.3 | 200 | 44.7 |
Polyphen2_HVAR | 213 | 47.7 | 234 | 52.3 |
DL-RP-MDS | 126 | 28.2 | 321 | 71.8 |
PrimateAI | 54 | 12.1 | 393 | 87.9 |
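
As a quick consistency check on the table above, each method's two counts sum to the same total of 447 variants, and the percentage columns follow from dividing each count by that total. The following is a minimal sketch using a few rows copied from the table; the helper name `percent_split` is illustrative only and does not come from the paper.

```python
from typing import Tuple


def percent_split(deleterious: int, tolerated: int) -> Tuple[float, float]:
    """Return (deleterious %, tolerated %), each rounded to one decimal place."""
    total = deleterious + tolerated
    return (
        round(100 * deleterious / total, 1),
        round(100 * tolerated / total, 1),
    )


# (deleterious count, tolerated count) for a few rows of the table above
rows = {
    "MutationTaster": (440, 7),
    "REVEL": (380, 67),
    "DL-RP-MDS": (126, 321),
    "PrimateAI": (54, 393),
}

for method, (n_del, n_tol) in rows.items():
    pct_del, pct_tol = percent_split(n_del, n_tol)
    print(f"{method}: total={n_del + n_tol}, deleterious={pct_del}%, tolerated={pct_tol}%")
```

The same check can be run on any other row by adding its two counts to `rows`.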
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tam, B.; Qin, Z.; Zhao, B.; Sinha, S.; Lei, C.L.; Wang, S.M. Classification of MLH1 Missense VUS Using Protein Structure-Based Deep Learning-Ramachandran Plot-Molecular Dynamics Simulations Method. Int. J. Mol. Sci. 2024, 25, 850. https://doi.org/10.3390/ijms25020850