The Deep Mining Era: Genomic, Metabolomic, and Integrative Approaches to Microbial Natural Products from 2018 to 2024
Abstract
1. Introduction
2. Genome Mining for NPs
2.1. Genome Mining of RiPPs
2.2. Genome Mining of Terpenoids
2.3. Genome Mining of Polyketides
3. Metabolome Mining for NPs
3.1. Metabolome Mining Based on MS
3.2. Metabolome Mining Based on NMR
4. Genomics- and Metabolomics-Guided Isolation of NPs
4.1. Genomics Combined with Isotope Labeling and NMR
4.2. Genomics Combined with MS
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Atanasov, A.G.; Zotchev, S.B.; Dirsch, V.M.; Supuran, C.T.; the International Natural Product Sciences Taskforce. Natural products in drug discovery: Advances and opportunities. Nat. Rev. Drug Discov. 2021, 20, 200–216. [Google Scholar] [CrossRef] [PubMed]
- Newman, D.J.; Cragg, G.M. Natural products as sources of new drugs from 1981 to 2014. J. Nat. Prod. 2016, 79, 629–661. [Google Scholar] [CrossRef] [PubMed]
- Ligon, B.L. Penicillin: Its discovery and early development. Semin. Pediatr. Infect. Dis. 2004, 15, 52–57. [Google Scholar] [CrossRef]
- Tribe, H.T. The discovery and development of cyclosporin. Mycologist 1998, 12, 20–22. [Google Scholar] [CrossRef]
- Tobert, J.A. Lovastatin and beyond: The history of the HMG-CoA reductase inhibitors. Nat. Rev. Drug Discov. 2003, 2, 517–526. [Google Scholar] [CrossRef]
- Chu, F.; Bai, Z.; Zhu, H. Research progress of microbial natural products against drug-resistant bacteria. Nat. Prod. Res. Dev. 2015, 27, 1466–1482. [Google Scholar]
- Milshteyn, A.; Schneider, J.S.; Brady, S.F. Mining the metabiome: Identifying novel natural products from microbial communities. Chem. Biol. 2014, 21, 1211–1223. [Google Scholar] [CrossRef]
- Kalaitzis, J.A.; Ingrey, S.D.; Chau, R.; Simon, Y.; Neilan, B.A. Genome-guided discovery of natural products and biosynthetic pathways from Australia’s untapped microbial megadiversity. Aust. J. Chem. 2016, 69, 129–135. [Google Scholar] [CrossRef]
- Li, Z.; Zhu, D.; Shen, Y. Discovery of novel bioactive natural products driven by genome mining. Drug Discov. Ther. 2018, 12, 318–328. [Google Scholar] [CrossRef]
- Chi, H.; Liu, T. Synthetic biology promotes efficient production and innovative discovery of natural products. Chin. Bull. Life Sci. 2021, 33, 1510–1519. [Google Scholar]
- Sukmarini, L. Recent advances in discovery of lead structures from microbial natural products: Genomics- and metabolomics-guided acceleration. Molecules 2021, 26, 2542. [Google Scholar] [CrossRef]
- Wenger, A.M.; Peluso, P.; Rowell, W.J.; Chang, P.C.; Hall, R.J.; Concepcion, G.T.; Ebler, J.; Fungtammasan, A.; Kolesnikov, A.; Olson, N.D.; et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 2019, 37, 1155–1162. [Google Scholar] [CrossRef] [PubMed]
- Loman, N.; Goodwin, S.; Jansen, H.; Loose, M. A disruptive sequencer meets disruptive publishing. F1000Research 2015, 4, 1074. [Google Scholar] [CrossRef] [PubMed]
- Rigali, S.; Anderssen, S.; Naômé, A.; van Wezel, G.P. Cracking the regulatory code of biosynthetic gene clusters as a strategy for natural product discovery. Biochem. Pharmacol. 2018, 153, 24–34. [Google Scholar] [CrossRef] [PubMed]
- Kandiah, M.; Urban, P.L. Advances in ultrasensitive mass spectrometry of organic molecules. Chem. Soc. Rev. 2013, 42, 5299–5322. [Google Scholar] [CrossRef] [PubMed]
- Kovacs, H.; Moskau, D.; Spraul, M. Cryogenically cooled probes—A leap in NMR technology. Prog. Nucl. Magn. Reson. Spectrosc. 2005, 46, 131–155. [Google Scholar] [CrossRef]
- Habib, F.; Tocher, D.A.; Carmalt, C.J. Applications of the crystalline sponge method and developments of alternative crystalline sponges. Mater. Today Proc. 2022, 56, 3766–3773. [Google Scholar] [CrossRef]
- Li, J.; Liu, J.K.; Wang, W.X. GIAO 13C NMR calculation with sorted training sets improves accuracy and reliability for structural assignation. J. Org. Chem. 2020, 85, 11350–11358. [Google Scholar] [CrossRef] [PubMed]
- Pescitelli, G.; Di Bari, L.; Berova, N. Conformational aspects in the studies of organic compounds by electronic circular dichroism. Chem. Soc. Rev. 2011, 40, 4603–4625. [Google Scholar] [CrossRef]
- Sanger, F.; Air, G.M.; Barrell, B.G.; Brown, N.L.; Coulson, A.R.; Fiddes, C.A.; Hutchison, C.A.; Slocombe, P.M.; Smith, M. Nucleotide sequence of bacteriophage phi X174 DNA. Nature 1977, 265, 687–695. [Google Scholar] [CrossRef]
- Auslander, N.; Gussow, A.B.; Koonin, E.V. Incorporating machine learning into established bioinformatics frameworks. Int. J. Mol. Sci. 2021, 22, 2903. [Google Scholar] [CrossRef] [PubMed]
- Wei, S.; Wang, S. Structural stability-aware deep learning: Advancing RNA secondary structure prediction. In Proceedings of the 2024 Fourth International Conference on Biomedicine and Bioinformatics Engineering, Kaifeng, China, 14–16 June 2024. 132521G. [Google Scholar]
- Liu, Z.; Su, L.; Fang, X.; Chang, D.; Chen, Z.; Jiang, X.; Li, T.; Wang, Y.; Guo, Y.; Wang, J. A Spatial Strain of Staphylococcus aureus LCT-SA67. CN103087943B, 4 March 2015. [Google Scholar]
- Guo, Y.; Chang, D.; Chen, Z.; Wang, Y.; Su, L.; Wang, L.; Wang, J.; Liu, Z.; Li, T.; Fang, X. A Spatial Strain of Klebsiella pneumoniae LCT-KP289. CN102994414B, 12 November 2014. [Google Scholar]
- Bao, Z.; Hu, J.; Zhang, L.; Bao, L.; Yu, H.; Li, Y.; Wang, S. Integrating Micro- and Macro-scale Comparative Genomics Analysis Methods. CN117976041A, 3 May 2024. [Google Scholar]
- Blin, K.; Shaw, S.; Augustijn, H.E.; Reitz, Z.L.; Biermann, F.; Alanjary, M.; Fetter, A.; Terlouw, B.R.; Metcalf, W.W.; Helfrich, E.J.N.; et al. antiSMASH 7.0: New and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res. 2023, 51, W46–W50. [Google Scholar] [CrossRef]
- Hannigan, G.D.; Prihoda, D.; Palicka, A.; Soukup, J.; Klempir, O.; Rampula, L.; Durcak, J.; Wurst, M.; Kotowski, J.; Chang, D.; et al. A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic Acids Res. 2019, 47, e110. [Google Scholar] [CrossRef]
- Liu, M.Y.; Li, Y.; Li, H.Z. Deep learning to predict the biosynthetic gene clusters in bacterial genomes. J. Mol. Biol. 2022, 434, 167597. [Google Scholar] [CrossRef] [PubMed]
- Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The performance of LSTM and BiLSTM in forecasting time series. In Proceedings of the IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019; pp. 3285–3292. [Google Scholar]
- Luo, L.; Yang, Z.H.; Yang, P.; Zhang, Y.; Wang, L.; Lin, H.F.; Wang, J. An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition. Bioinformatics 2018, 34, 1381–1388. [Google Scholar] [CrossRef]
- Ren, Q.; Cheng, H.; Han, H. Research on machine learning framework based on random forest algorithm. In Proceedings of the International Conference on Advances in Materials, Machinery, Electronics (AMME), Wuhan, China, 25–26 February 2017. [Google Scholar]
- Schonlau, M.; Zou, R.Y. The random forest algorithm for statistical learning. Stata J. 2020, 20, 3–29. [Google Scholar] [CrossRef]
- Anker, A.S.; Friis-Jensen, U.; Johansen, F.L.; Billinge, S.J.L.; Jensen, K.M.Ø. ClusterFinder: A fast tool to find cluster structures from pair distribution function data. Acta Crystallogr. A-Found. Adv. 2024, 80, 213–220. [Google Scholar] [CrossRef]
- McKenna, A.; Hanna, M.; Banks, E.; Sivachenko, A.; Cibulskis, K.; Kernytsky, A.; Garimella, K.; Altshuler, D.; Gabriel, S.; Daly, M.; et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010, 20, 1297–1303. [Google Scholar] [CrossRef] [PubMed]
- van Dijk, E.L.; Auger, H.; Jaszczyszyn, Y.; Thermes, C. Ten years of next-generation sequencing technology. Trends Genet. 2014, 30, 418–426. [Google Scholar] [CrossRef]
- Heidersbach, A.J.; Dorighi, K.M.; Gomez, J.A.; Jacobi, A.M.; Haley, B. A versatile, high-efficiency platform for CRISPR-based gene activation. Nat. Commun. 2023, 14, 902. [Google Scholar] [CrossRef] [PubMed]
- DeCarlo, P.F.; Kimmel, J.R.; Trimborn, A.; Northway, M.J.; Jayne, J.T.; Aiken, A.C.; Gonin, M.; Fuhrer, K.; Horvath, T.; Docherty, K.S.; et al. Field-deployable, high-resolution, time-of-flight aerosol mass spectrometer. Anal. Chem. 2006, 78, 8281–8289. [Google Scholar] [CrossRef] [PubMed]
- Marshall, A.G.; Hendrickson, C.L.; Jackson, G.S. Fourier transform ion cyclotron resonance mass spectrometry: A primer. Mass Spectrom. Rev. 1998, 17, 1–35. [Google Scholar] [CrossRef]
- Olsen, J.V.; de Godoy, L.M.F.; Li, G.Q.; Macek, B.; Mortensen, P.; Pesch, R.; Makarov, A.; Lange, O.; Horning, S.; Mann, M. Parts per million mass accuracy on an orbitrap mass spectrometer via lock mass injection into a C-trap. Mol. Cell. Proteom. 2005, 4, 2010–2021. [Google Scholar] [CrossRef]
- Comisarow, M.B.; Marshall, A.G. Fourier transform ion cyclotron resonance spectroscopy. Chem. Phys. Lett. 1974, 25, 282–283. [Google Scholar] [CrossRef]
- Dumez, J.N. NMR methods for the analysis of mixtures. Chem. Commun. 2022, 58, 13855–13872. [Google Scholar] [CrossRef]
- Shapiro, M.J.; Wareing, J.R. NMR methods in combinatorial chemistry. Curr. Opin. Chem. Biol. 1998, 2, 372–375. [Google Scholar] [CrossRef]
- Lhoste, C.; Lorandel, B.; Praud, C.; Marchand, A.; Mishra, R.; Dey, A.; Bernard, A.; Dumez, J.N.; Giraudeau, P. Ultrafast 2D NMR for the analysis of complex mixtures. Prog. Nucl. Magn. Reson. Spectrosc. 2022, 130, 1–46. [Google Scholar] [CrossRef] [PubMed]
- Hansen, A.L.; Kupce, E.; Li, D.W.; Bruschweiler-Li, L.; Wang, C.; Brüschweiler, R. 2D NMR-based metabolomics with HSQC/TOCSY NOAH supersequences. Anal. Chem. 2021, 93, 6112–6119. [Google Scholar] [CrossRef]
- Wang, M.X.; Carver, J.J.; Phelan, V.V.; Sanchez, L.M.; Garg, N.; Peng, Y.; Nguyen, D.D.; Watrous, J.; Kapono, C.A.; Luzzatto-Knaan, T.; et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 2016, 34, 828–837. [Google Scholar] [CrossRef] [PubMed]
- Nothias, L.F.; Petras, D.; Schmid, R.; Dührkop, K.; Rainer, J.; Sarvepalli, A.; Protsyuk, I.; Ernst, M.; Tsugawa, H.; Fleischauer, M.; et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods 2020, 17, 905–908. [Google Scholar] [CrossRef]
- Duhrkop, K.; Fleischauer, M.; Ludwig, M.; Aksenov, A.A.; Melnik, A.V.; Meusel, M.; Dorrestein, P.C.; Rousu, J.; Bocker, S. SIRIUS 4: A rapid tool for turning tandem mass spectra into metabolite structure information. Nat. Methods 2019, 16, 299–302. [Google Scholar] [CrossRef] [PubMed]
- Deng, Y.J.; Yao, Y.; Wang, Y.N.; Yu, T.T.; Cai, W.H.; Zhou, D.L.; Yin, F.; Liu, W.L.; Liu, Y.Y.; Xie, C.B.; et al. An end-to-end deep learning method for mass spectrometry data analysis to reveal disease-specific metabolic profiles. Nat. Commun. 2024, 15, 7136. [Google Scholar] [CrossRef] [PubMed]
- Eckert-Boulet, N.; Nielsen, P.S.; Friis, C.; dos Santos, M.M.; Nielsen, J.; Kielland-Brandt, M.C.; Regenberg, B. Transcriptional profiling of extracellular amino acid sensing in Saccharomyces cerevisiae and the role of Stp1p and Stp2p. Yeast 2004, 21, 635–648. [Google Scholar] [CrossRef] [PubMed]
- Ishii, N.; Nakahigashi, K.; Baba, T.; Robert, M.; Soga, T.; Kanai, A.; Hirasawa, T.; Naba, M.; Hirai, K.; Hoque, A.; et al. Multiple high-throughput analyses monitor the response of E.coli to perturbations. Science 2007, 316, 593–597. [Google Scholar] [CrossRef]
- Huo, L.J.; Hug, J.J.; Fu, C.Z.; Bian, X.Y.; Zhang, Y.M.; Müller, R. Heterologous expression of bacterial natural product biosynthetic pathways. Nat. Prod. Rep. 2019, 36, 1412–1436. [Google Scholar] [CrossRef]
- Romano, S.; Jackson, S.A.; Patry, S.; Dobson, A.D.W. Extending the “one strain many compounds” (OSMAC) principle to marine microorganisms. Mar. Drugs 2018, 16, 244. [Google Scholar] [CrossRef]
- Si, Y.; Feng, Q. Application of OSMAC strategy in the study of microbial secondary metabolites. J. Shenyang Pharm. Univ. 2023, 40, 370–380. [Google Scholar]
- Deng, Q.S.; Li, Y.C.; He, W.Y.; Chen, T.; Liu, N.; Ma, L.M.; Qiu, Z.X.; Shang, Z.; Wang, Z.Q. A polyene macrolide targeting phospholipids in the fungal cell membrane. Nature 2025, 640, 743–751. [Google Scholar] [CrossRef]
- Montalbán-López, M.; Scott, T.A.; Ramesh, S.; Rahman, I.R.; van Heel, A.J.; Viel, J.H.; Bandarian, V.; Dittmann, E.; Genilloud, O.; Goto, Y.; et al. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 2021, 38, 130–239. [Google Scholar] [CrossRef]
- Liu, J.; Liu, R.; He, B.-B.; Lin, X.; Guo, L.; Wu, G.; Li, Y.-X. Bacterial cytochrome P450 catalyzed macrocyclization of ribosomal peptides. ACS Bio Med Chem Au 2024, 4, 268–279. [Google Scholar] [CrossRef]
- Kunakom, S.; Otani, H.; Udwary, D.W.; Doering, D.T.; Mouncey, N.J. Cytochromes P450 involved in bacterial RiPP biosyntheses. J. Ind. Microbiol. Biotechnol. 2023, 50, 2023. [Google Scholar] [CrossRef] [PubMed]
- Zhong, G. Cytochromes P450 associated with the biosyntheses of ribosomally synthesized and post-translationally modified peptides. ACS Bio Med Chem Au 2023, 3, 371–388. [Google Scholar] [CrossRef]
- Laws, D., III; Plouch, E.V.; Blakey, S.B. Synthesis of ribosomally synthesized and post-translationally modified peptides containing C-C cross-links. J. Nat. Prod. 2022, 85, 2519–2539. [Google Scholar] [CrossRef] [PubMed]
- Zhu, W.S.; Shenoy, A.; Kundrotas, P.; Elofsson, A. Evaluation of alphafold-multimer prediction on multi-chain protein complexes. Bioinformatics 2023, 39, btad424. [Google Scholar] [CrossRef]
- He, B.B.; Liu, J.; Cheng, Z.; Liu, R.Z.; Zhong, Z.; Gao, Y.; Liu, H.Y.; Song, Z.M.; Tian, Y.Q.; Li, Y.X. Bacterial cytochrome P450 catalyzed post-translational macrocyclization of ribosomal peptides. Angew. Chem.-Int. Ed. 2023, 62, e202311533. [Google Scholar] [CrossRef]
- Gerlt, J.A.; Bouvier, J.T.; Davidson, D.B.; Imker, H.J.; Sadkhin, B.; Slater, D.R.; Whalen, K.L. Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks. Biochim. Biophys. Acta-Proteins Proteom. 2015, 1854, 1019–1037. [Google Scholar] [CrossRef]
- Moffat, A.D.; Santos-Aberturas, J.; Chandra, G.; Truman, A.W. A user guide for the identification of new RiPP biosynthetic gene clusters using a RiPPER-based workflow. In Antimicrobial Therapies: Methods and Protocols; Barreiro, C., Barredo, J.L., Eds.; Springer: Berlin/Heidelberg, Germany, 2021; Volume 2296, pp. 227–247. [Google Scholar]
- Santos-Aberturas, J.; Chandra, G.; Frattaruolo, L.; Lacret, R.; Pham, T.H.; Vior, N.M.; Eyles, T.H.; Truman, A.W. Uncovering the unexplored diversity of thioamidated ribosomal peptides in Actinobacteria using the RiPPER genome mining tool. Nucleic Acids Res. 2019, 47, 4624–4637. [Google Scholar] [CrossRef] [PubMed]
- Liu, C.L.; Wang, Z.J.; Shi, J.; Yan, Z.Y.; Zhang, G.D.; Jiao, R.H.; Tan, R.X.; Ge, H.M. P450-modified multicyclic cyclophane-containing ribosomally synthesized and post-translationally modified peptides. Angew. Chem.-Int. Ed. 2024, 63, e202314046. [Google Scholar] [CrossRef]
- Altschul, S.; Madden, T.; Schaffer, A.; Zhang, J.; Zhang, Z.; Miller, W.; Lipman, D. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. FASEB J. 1998, 12, A1326. [Google Scholar] [CrossRef] [PubMed]
- Tietz, J.I.; Schwalen, C.J.; Patel, P.S.; Maxson, T.; Blair, P.M.; Tai, H.C.; Zakai, U.I.; Mitchell, D.A. A new genome-mining tool redefines the lasso peptide biosynthetic landscape. Nat. Chem. Biol. 2017, 13, 470–478. [Google Scholar] [CrossRef]
- Nam, H.; An, J.S.; Lee, J.; Yun, Y.; Lee, H.; Park, H.; Jung, Y.; Oh, K.B.; Oh, D.C.; Kim, S. Exploring the diverse landscape of biaryl-containing peptides generated by cytochrome P450 macrocyclases. J. Am. Chem. Soc. 2023, 145, 22047–22057. [Google Scholar] [CrossRef] [PubMed]
- Wenzel, S.C.; Müller, R. Recent developments towards the heterologous expression of complex bacterial natural product biosynthetic pathways. Curr. Opin. Biotechnol. 2005, 16, 594–606. [Google Scholar] [CrossRef] [PubMed]
- Gomez-Escribano, J.P.; Bibb, M.J. Heterologous expression of natural product biosynthetic gene clusters in Streptomyces coelicolor: From genome mining to manipulation of biosynthetic pathways. J. Ind. Microbiol. Biotechnol. 2014, 41, 425–431. [Google Scholar] [CrossRef]
- Caldwell, B.J.; Bell, C.E. Structure and mechanism of the red recombination system of bacteriophage λ. Prog. Biophys. Mol. Biol. 2019, 147, 33–46. [Google Scholar] [CrossRef]
- Murphy, K.C. λ recombination and recombineering. EcoSal Plus 2016, 7. [Google Scholar] [CrossRef] [PubMed]
- Kawahara, T.; Izumikawa, M.; Kozone, I.; Hashimoto, J.; Kagaya, N.; Koiwai, H.; Komatsu, M.; Fujie, M.; Sato, N.; Ikeda, H.; et al. Neothioviridamide, a polythioamide compound produced by heterologous expression of a Streptomyces sp. Cryptic RiPP biosynthetic gene cluster. J. Nat. Prod. 2018, 81, 264–269. [Google Scholar] [CrossRef]
- Guo, M.X.; Zhang, M.M.; Sun, K.; Cui, J.J.; Liu, Y.C.; Gao, K.; Dong, S.H.; Luo, S.W. Genome mining of linaridins provides insights into the widely distributed LinC oxidoreductases. J. Nat. Prod. 2023, 86, 2333–2341. [Google Scholar] [CrossRef]
- Parson, W.; Strobl, C.; Huber, G.; Zimmermann, B.; Gomes, S.M.; Souto, L.; Fendt, L.; Delport, R.; Langit, R.; Wootton, S.; et al. Evaluation of next generation mtGenome sequencing using the Ion Torrent Personal Genome Machine (PGM). Forensic Sci. Int.-Genet. 2013, 7, 543–549. [Google Scholar] [CrossRef]
- Aziz, R.K.; Bartels, D.; Best, A.A.; DeJongh, M.; Disz, T.; Edwards, R.A.; Formsma, K.; Gerdes, S.; Glass, E.M.; Kubal, M.; et al. The RAST server: Rapid annotations using subsystems technology. BMC Genom. 2008, 9, 75. [Google Scholar] [CrossRef]
- Cortés-Albayay, C.; Jarmusch, S.A.; Trusch, F.; Ebel, R.; Andrews, B.A.; Jaspars, M.; Asenjo, J.A. Downsizing class II lasso peptides: Genome mining-guided isolation of huascopeptin containing the first gly1-asp7 macrocycle. J. Org. Chem. 2020, 85, 1661–1667. [Google Scholar] [CrossRef]
- Lei, R.; Tao, H.; Liu, T. Deep genome mining boosts the discovery of microbial terpenoids. Synth. Biol. J. 2024, 5, 507–526. [Google Scholar]
- Chen, R.; Jia, Q.D.; Mu, X.; Hu, B.; Sun, X.; Deng, Z.X.; Chen, F.; Bian, G.K.; Liu, T.G. Systematic mining of fungal chimeric terpene synthases using an efficient precursor-providing yeast chassis. Proc. Natl. Acad. Sci. USA 2021, 118, e2023247118. [Google Scholar] [CrossRef] [PubMed]
- He, H.B.; Bian, G.K.; Herbst-Gervasoni, C.J.; Mori, T.; Shinsky, S.A.; Hou, A.W.; Mu, X.; Huang, M.J.; Cheng, S.; Deng, Z.X.; et al. Discovery of the cryptic function of terpene cyclases as aromatic prenyltransferases. Nat. Commun. 2020, 11, 3958. [Google Scholar] [CrossRef]
- Chen, C.C.; Malwal, S.R.; Han, X.; Liu, W.D.; Ma, L.X.; Zhai, C.; Dai, L.H.; Huang, J.W.; Shillo, A.; Desai, J.; et al. Terpene cyclases and prenyltransferases: Structures and mechanisms of action. ACS Catal. 2021, 11, 290–303. [Google Scholar] [CrossRef]
- Yu, D.S.; Lee, D.H.; Kim, S.K.; Lee, C.H.; Song, J.Y.; Kong, E.B.; Kim, J.F. Algorithm for predicting functionally equivalent proteins from BLAST and HMMER searches. J. Microbiol. Biotechnol. 2012, 22, 1054–1058. [Google Scholar] [CrossRef] [PubMed]
- Hubley, R.; Finn, R.D.; Clements, J.; Eddy, S.R.; Jones, T.A.; Bao, W.D.; Smit, A.F.A.; Wheelers, T.J. The Dfam database of repetitive DNA families. Nucleic Acids Res. 2016, 44, D81–D89. [Google Scholar] [CrossRef] [PubMed]
- Sayers, E.W.; Beck, J.; Bolton, E.E.; Brister, J.R.; Chan, J.; Comeau, D.C.; Connor, R.; DiCuccio, M.; Farrell, C.M.; Feldgarden, M.; et al. Database resources of the national center for biotechnology information. Nucleic Acids Res. 2024, 52, D33–D43. [Google Scholar] [CrossRef]
- Brown, G.R.; Hem, V.; Katz, K.S.; Ovetsky, M.; Wallin, C.; Ermolaeva, O.; Tolstoy, I.; Tatusova, T.; Pruitt, K.D.; Maglott, D.R.; et al. Gene: A gene-centered information resource at NCBI. Nucleic Acids Res. 2015, 43, D36–D42. [Google Scholar] [CrossRef]
- Bateman, A.; Martin, M.J.; O’Donovan, C.; Magrane, M.; Apweiler, R.; Alpi, E.; Antunes, R.; Ar-Ganiska, J.; Bely, B.; Bingley, M.; et al. UniProt: A hub for protein information. Nucleic Acids Res. 2015, 43, D204–D212. [Google Scholar]
- Bateman, A.; Martin, M.J.; Orchard, S.; Magrane, M.; Adesina, A.; Ahmad, S.; Bowler-Barnett, E.H.; Bye-A-Jee, H.; Carpentier, D.; Denny, P.; et al. UniProt: The Universal Protein Knowledgebase in 2025. Nucleic Acids Res. 2024, 52, D609–D617. [Google Scholar]
- Tang, J.; Matsuda, Y. Discovery of fungal onoceroid triterpenoids through domainless enzyme-targeted global genome mining. Nat. Commun. 2024, 15, 4312. [Google Scholar] [CrossRef] [PubMed]
- Racolta, S.; Juhl, P.B.; Sirim, D.; Pleiss, J. The triterpene cyclase protein family: A systematic analysis. Proteins-Struct. Funct. Bioinform. 2012, 80, 2009–2019. [Google Scholar] [CrossRef]
- Chen, R.; Feng, T.; Li, M.; Zhang, X.Y.; He, J.; Hu, B.; Deng, Z.X.; Liu, T.A.; Liu, J.K.; Wang, X.H.; et al. Characterization of tremulane sesquiterpene synthase from the basidiomycete Irpex lacteus. Org. Lett. 2022, 24, 5669–5673. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Jiang, Y.Y.; Zhang, X.W.; Chang, Y.M.; Li, S.; Zhang, X.M.; Zheng, S.M.; Geng, C.; Men, P.; Ma, L.; et al. Fragrant venezuelaenes A and B with A 5-5-6-7 tetracyclic skeleton: Discovery, biosynthesis, and mechanisms of central catalysts. ACS Catal. 2020, 10, 5846–5851. [Google Scholar] [CrossRef]
- Zhang, P.; Wu, G.W.; Heard, S.C.; Niu, C.S.; Bell, S.A.; Li, F.L.; Ye, Y.; Zhang, Y.H.; Winter, J.M. Identification and characterization of a cryptic bifunctional type I diterpene synthase involved in talaronoid biosynthesis from a marine-derived fungus. Org. Lett. 2022, 24, 7037–7041. [Google Scholar] [CrossRef] [PubMed]
- Sun, X.; Cai, Y.S.; Yuan, Y.J.; Bian, G.K.; Ye, Z.L.; Deng, Z.X.; Liu, T.G. Genome mining in Trichoderma viride J1-030: Discovery and identification of novel sesquiterpene synthase and its products. Beilstein J. Org. Chem. 2019, 15, 2052–2058. [Google Scholar] [CrossRef]
- Li, W.Z.; Godzik, A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006, 22, 1658–1659. [Google Scholar] [CrossRef]
- Fu, L.M.; Niu, B.F.; Zhu, Z.W.; Wu, S.T.; Li, W.Z. CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 2012, 28, 3150–3152. [Google Scholar] [CrossRef]
- Navarro-Muñoz, J.C.; Selem-Mojica, N.; Mullowney, M.W.; Kautsar, S.A.; Tryon, J.H.; Parkinson, E.; De Los Santos, E.L.C.; Yeong, M.; Cruz-Morales, P.; Abubucker, S.; et al. A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol. 2020, 16, 60–68. [Google Scholar] [CrossRef]
- Liu, W.C.; Tian, X.Y.; Huang, X.; Malit, J.J.L.; Wu, C.H.; Guo, Z.H.; Tang, J.W.; Qian, P.Y. Discovery of P450-modified sesquiterpenoids levinoids A-D through global genome mining. J. Nat. Prod. 2024, 87, 876–883. [Google Scholar] [CrossRef]
- Guo, J.J.; Cai, Y.S.; Cheng, F.C.; Yang, C.J.; Zhang, W.Q.; Yu, W.L.; Yan, J.J.; Deng, Z.X.; Hong, K. Genome mining reveals a multiproduct sesterterpenoid biosynthetic gene cluster in Aspergillus ustus. Org. Lett. 2021, 23, 1525–1529. [Google Scholar] [CrossRef] [PubMed]
- Yan, D.H.; Zhou, M.Q.; Adduri, A.; Zhuang, Y.H.; Guler, M.; Liu, S.T.; Shin, H.; Kovach, T.; Oh, G.; Liu, X.; et al. Discovering type I cis-AT polyketides through computational mass spectrometry and genome mining with Seq2PKS. Nat. Commun. 2024, 15, 5356. [Google Scholar] [CrossRef] [PubMed]
- Liaw, Y.C. Improvement of the fast exact pairwise-nearest-neighbor algorithm. Pattern Recognit. 2009, 42, 867–870. [Google Scholar] [CrossRef]
- Mohimani, H.; Gurevich, A.; Shlemov, A.; Mikheenko, A.; Korobeynikov, A.; Cao, L.; Shcherbin, E.; Nothias, L.F.; Dorrestein, P.C.; Pevzner, P.A. Dereplication of microbial metabolites through database search of mass spectra. Nat. Commun. 2018, 9, 4035. [Google Scholar] [CrossRef]
- Zhao, L.Y.; Shi, J.; Xu, Z.Y.; Sun, J.L.; Yan, Z.Y.; Tong, Z.W.; Tan, R.X.; Jiao, R.H.; Ge, H.M. Hybrid type I and II polyketide synthases yield distinct aromatic polyketides. J. Am. Chem. Soc. 2024, 146, 29462–29468. [Google Scholar] [CrossRef] [PubMed]
- Dong, J.Y.; Tang, M.C.; Liu, L. α-pyrone derivatives from Calcarisporium arbuscula discovered by genome mining. J. Nat. Prod. 2023, 86, 2496–2501. [Google Scholar] [CrossRef]
- Khan, H.; Ali, J. UHPLC/Q-TOF-MS Technique: Introduction and Applications. Lett. Org. Chem. 2015, 12, 371–378. [Google Scholar] [CrossRef]
- Alsaleh, M.; Barbera, T.A.; Andrews, R.H.; Sithithaworn, P.; Khuntikeo, N.; Loilome, W.; Yongvanit, P.; Cox, I.J.; Syms, R.R.A.; Holmes, E.; et al. Mass spectrometry: A guide for the clinician. J. Clin. Exp. Hepatol. 2019, 9, 597–606. [Google Scholar] [CrossRef]
- Liu, J.Z.; Wang, Y.D.; Fang, H.Q.; Sun, G.B.; Ding, G. UPLC-Q-TOF-MS/MS-based targeted discovery of chetomin analogues from Chaetomium cochliodes. J. Nat. Prod. 2024, 87, 1660–1665. [Google Scholar] [CrossRef]
- Hu, Y.W.; Ma, S.; Pang, X.Y.; Cong, M.J.; Liu, Q.Q.; Han, F.H.; Wang, J.J.; Feng, W.E.; Liu, Y.H.; Wang, J.F. Cytotoxic pyridine alkaloids from a marine-derived fungus Arthrinium arundinis exhibiting apoptosis-inducing activities against small cell lung cancer. Phytochemistry 2023, 213, 113765. [Google Scholar] [CrossRef]
- Chen, C.M.; Chen, W.H.; Tao, H.M.; Yang, B.; Zhou, X.F.; Luo, X.W.; Liu, Y.H. Diversified polyketides and nitrogenous compounds from the mangrove endophytic fungus Penicillium steckii SCSIO 41025. Chin. J. Chem. 2021, 39, 2132–2140. [Google Scholar] [CrossRef]
- Guo, H.J.; Daniel, J.M.; Seibel, E.; Burkhardt, I.; Conlon, B.H.; Görls, H.; Vassao, D.G.; Dickschat, J.S.; Poulsen, M.; Beemelmanns, C. Insights into the metabolomic capacity of Podaxis and isolation of podaxisterols A-D, ergosterol derivatives carrying nitrosyl cyanide-derived modifications. J. Nat. Prod. 2022, 85, 2159–2167. [Google Scholar] [CrossRef]
- Marner, M.; Patras, M.A.; Kurz, M.; Zubeil, F.; Förster, F.; Schuler, S.; Bauer, A.; Hammann, P.; Vilcinskas, A.; Schäberle, T.F.; et al. Molecular networking-guided discovery and characterization of stechlisins, a group of cyclic lipopeptides from a Pseudomonas sp. J. Nat. Prod. 2020, 83, 2607–2617. [Google Scholar] [CrossRef] [PubMed]
- Um, S.; Seibel, E.; Schalk, F.; Balluff, S.; Beemelmanns, C. Targeted isolation of saalfelduracin B-D from Amycolatopsis saalfeldensis using LC-MS/MS-based molecular networking. J. Nat. Prod. 2021, 84, 1002–1011. [Google Scholar] [CrossRef] [PubMed]
- Guo, J.; Huan, T. Comparison of full-scan, data-dependent, and data-independent acquisition modes in liquid chromatography-mass spectrometry based untargeted metabolomics. Anal. Chem. 2020, 92, 8072–8080. [Google Scholar] [CrossRef]
- Yu, F.C.; Teo, G.C.; Kong, A.T.; Fröhlich, K.; Li, G.X.; Demichev, V.; Nesvizhskii, A.I. Analysis of DIA proteomics data using MSFragger-DIA and FragPipe computational platform. Nat. Commun. 2023, 14, 4154. [Google Scholar] [CrossRef]
- Chang, S.S.; Li, Y.H.; Huang, X.Y.; He, N.; Wang, M.Y.; Wang, J.H.; Luo, M.N.; Li, Y.; Xie, Y.Y. Bioactivity-based molecular networking-guided isolation of epicolidines A-C from the endophytic fungus Epicoccum sp. 1-042. J. Nat. Prod. 2024, 87, 1582–1590. [Google Scholar] [CrossRef]
- Damiani, T.; Jarmusch, A.K.; Aron, A.T.; Petras, D.; Phelan, V.V.; Zhao, H.N.; Bittremieux, W.; Acharya, D.D.; Ahmed, M.M.A.; Bauermeister, A.; et al. A universal language for finding mass spectrometry data patterns. Nat. Methods 2025, 22, 1247–1254. [Google Scholar] [CrossRef]
- Berger, T.; Alenfelder, J.; Steinmüller, S.; Heimann, D.; Gohain, N.; Petras, D.; Wang, M.X.; Berger, R.; Kostenis, E.; Reher, R. A massQL-integrated molecular networking approach for the discovery and substructure annotation of bioactive cyclic peptides. J. Nat. Prod. 2024, 87, 692–704. [Google Scholar] [CrossRef]
- Hou, X.M.; Li, Y.Y.; Shi, Y.W.; Fang, Y.W.; Chao, R.; Gu, Y.C.; Wang, C.Y.; Shao, C.L. Integrating molecular networking and 1H NMR to target the isolation of chrysogeamides from a library of marine-derived Penicillium fungi. J. Org. Chem. 2019, 84, 1228–1237. [Google Scholar] [CrossRef]
- Flores-Bocanegra, L.; Al Subeh, Z.Y.; Egan, J.M.; El-Elimat, T.; Raja, H.A.; Burdette, J.E.; Pearce, C.J.; Linington, R.G.; Oberlies, N.H. Dereplication of fungal metabolites by NMR-based compound networking using MADByTE. J. Nat. Prod. 2022, 85, 614–624. [Google Scholar] [CrossRef]
- Borges, R.M.; Ferreira, G.D.; Campos, M.M.; Teixeira, A.M.; Costa, F.D.; das Chagas, F.O.; Colonna, M. NMR as a tool for compound identification in mixtures. Phytochem. Anal. 2023, 34, 385–392. [Google Scholar] [CrossRef]
- Egan, J.M.; van Santen, J.A.; Liu, D.Y.; Linington, R.G. Development of an NMR-based platform for the direct structural annotation of complex natural products mixtures. J. Nat. Prod. 2021, 84, 1044–1055. [Google Scholar] [CrossRef]
- Agrawal, P.; Khater, S.; Gupta, M.; Sain, N.; Mohanty, D. RiPPMiner: A bioinformatics resource for deciphering chemical structures of RiPPs based on prediction of cleavage and cross-links. Nucleic Acids Res. 2017, 45, W80–W88. [Google Scholar] [CrossRef]
- Agrawal, P.; Amir, S.; Deepak; Barua, D.; Mohanty, D. RiPPMiner-Genome: A web resource for automated prediction of crosslinked chemical structures of RiPPs by genome mining. J. Mol. Biol. 2021, 433, 166887. [Google Scholar] [CrossRef]
- Mahmud, T. Isotope tracer investigations of natural products biosynthesis: The discovery of novel metabolic pathways. J. Label. Compd. Radiopharm. 2007, 50, 1039–1051. [Google Scholar] [CrossRef]
- Saad, H.; Aziz, S.; Gehringer, M.; Kramer, M.; Straetener, J.; Berscheid, A.; Brötz-Oesterhelt, H.; Gross, H. Nocathioamides, Uncovered by a tunable metabologenomic approach, define a novel class of chimeric lanthipeptides. Angew. Chem.-Int. Ed. 2021, 60, 16472–16479. [Google Scholar] [CrossRef]
- Reher, R.; Kim, H.W.; Zhang, C.; Mao, H.H.; Wang, M.X.; Nothias, L.F.; Caraballo-Rodriguez, A.M.; Glukhov, E.; Teke, B.; Leao, T.; et al. A convolutional neural network-based approach for the rapid annotation of molecularly diverse natural products. J. Am. Chem. Soc. 2020, 142, 4114–4120. [Google Scholar] [CrossRef]
- Sun, Z.L.; Wu, M.Y.; Zhong, B.Y.; Wu, J.S.; Liu, D.; Ren, J.W.; Fan, S.L.; Lin, W.H.; Fan, A.L. Target discovery of dhilirane-type meroterpenoids by biosynthesis guidance and tailoring enzyme catalysis. J. Am. Chem. Soc. 2024, 146, 30242–30251. [Google Scholar] [CrossRef]
- Shin, D.; Byun, W.S.; Kang, S.; Kang, I.; Bae, E.S.; An, J.S.; Im, J.H.; Park, J.; Kim, E.; Ko, K.; et al. Targeted and logical discovery of piperazic acid-bearing natural products based on genomic and spectroscopic signatures. J. Am. Chem. Soc. 2023, 145, 19676–19690. [Google Scholar] [CrossRef]
- Morgan, K.D.; Williams, D.E.; Patrick, B.O.; Remigy, M.; Banuelos, C.A.; Sadar, M.D.; Ryan, K.S.; Andersen, R.J. Incarnatapeptins A and B, nonribosomal peptides discovered using genome mining and 1H 15N HSQC-TOCSY. Org. Lett. 2020, 22, 4053–4057. [Google Scholar] [CrossRef]
- Huang, H.M.; Yue, L.G.; Deng, F.Y.; Wang, X.Y.; Wang, N.; Chen, H.; Li, H.Y. NMR-metabolomic profiling and genome mining drive the discovery of cyclic decapeptides from a marine Streptomyces. J. Nat. Prod. 2023, 86, 2122–2130.
- Park, J.; Shin, Y.H.; Hwang, S.; Kim, J.; Moon, D.H.; Kang, I.; Ko, Y.J.; Chung, B.; Nam, H.; Kim, S.; et al. Discovery of terminal oxazole-bearing natural products by a targeted metabologenomic approach. Angew. Chem.-Int. Ed. 2024, 63, e202402465.
- Cox, C.L.; Tietz, J.I.; Melby, J.O.; Sokolowski, K.; Doroghazi, J.R.; Mitchell, D.A. Nucleophilic 1,4-additions for natural product discovery. Abstr. Gen. Meet. Am. Soc. Microbiol. 2014, 114, 2438.
- McCaughey, C.S.; van Santen, J.A.; van der Hooft, J.J.J.; Medema, M.H.; Linington, R.G. An isotopic labeling approach linking natural products with biosynthetic gene clusters. Nat. Chem. Biol. 2022, 18, 295–304.
- Fergusson, C.H.; Saulog, J.; Paulo, B.S.; Wilson, D.M.; Liu, D.Y.; Morehouse, N.J.; Waterworth, S.; Barkei, J.; Gray, C.A.; Kwan, J.C.; et al. Discovery of a lagriamide polyketide by integrated genome mining, isotopic labeling, and untargeted metabolomics. Chem. Sci. 2024, 15, 8089–8096.
- Chen, D.W.; Song, Z.J.; Han, J.J.; Liu, J.M.; Liu, H.W.; Dai, J.G. Targeted discovery of glycosylated natural products by tailoring enzyme-guided genome mining and MS-based metabolome analysis. J. Am. Chem. Soc. 2024, 146, 9614–9622.
- Lee, J.; Um, S.; Kim, E.H.; Kim, S.H. Genomic and metabolomic analyses of Nocardiopsis maritima YSL2 as the mycorrhizosphere bacterium of Suaeda maritima (L.) Dumort. J. Nat. Prod. 2024, 87, 733–742.
- Ahmed, M.M.A.; Boudreau, P.D. LCMS-metabolomic profiling and genome mining of Delftia lacustris DSM 21246 revealed lipophilic delftibactin metallophores. J. Nat. Prod. 2024, 87, 1384–1393.
- Yang, F.; Sang, M.L.; Lu, J.R.; Zhao, H.M.; Zou, Y.K.; Wu, W.; Yu, Y.; Liu, Y.W.; Ma, W.C.; Zhang, Y.; et al. Somalactams A-D: Anti-inflammatory macrolide lactams with unique ring systems from an arctic actinomycete strain. Angew. Chem.-Int. Ed. 2023, 62, e202218085.
- Tao, H.; Lauterbach, L.; Bian, G.K.; Chen, R.; Hou, A.W.; Mori, T.; Cheng, S.; Hu, B.; Lu, L.; Mu, X.; et al. Discovery of non-squalene triterpenes. Nature 2022, 606, 414–419.
- Mullowney, M.W.; Duncan, K.R.; Elsayed, S.S.; Garg, N.; van der Hooft, J.J.J.; Martin, N.I.; Meijer, D.; Terlouw, B.R.; Biermann, F.; Blin, K.; et al. Artificial intelligence for natural product drug discovery. Nat. Rev. Drug Discov. 2023, 22, 895–916.
- Xue, H.T.; Stanley-Baker, M.; Kong, A.W.K.; Li, H.L.; Goh, W.W.B. Data considerations for predictive modeling applied to the discovery of bioactive natural products. Drug Discov. Today 2022, 27, 2235–2243.
- Schneider, P.; Altmann, K.H.; Schneider, G. Generating bioactive natural product-inspired molecules with machine intelligence. Chimia 2022, 76, 396–401.
- Arora, S.; Chettri, S.; Percha, V.; Kumar, D.; Latwal, M. Artificial intelligence: A virtual chemist for natural product drug discovery. J. Biomol. Struct. Dyn. 2024, 42, 3826–3835.
Method | Workflow | Applications | Advancements |
---|---|---|---|
Genome mining combined with NMR-based isotope labeling | | | The comprehensive use of antiSMASH and various other tools enables more accurate identification of RiPP BGC types. |
Genome mining combined with SMART and molecular networking | | | SMART has a stronger ability to capture characteristic information in mixtures and can directly associate compounds with BGCs. |
Genome mining combined with two-dimensional spectral features | | | Capturing single-key correlations enhances detection sensitivity, and a small trial is sufficient to determine whether a gene is successfully expressed. |
Genome mining combined with isotope feeding and IsoAnalyst | | | IsoAnalyst can accurately capture the connections between molecules that are biologically related but have significantly different spectral characteristics. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Z.; Yu, J.; Wang, C.; Hua, Y.; Wang, H.; Chen, J. The Deep Mining Era: Genomic, Metabolomic, and Integrative Approaches to Microbial Natural Products from 2018 to 2024. Mar. Drugs 2025, 23, 261. https://doi.org/10.3390/md23070261