A Chromosome-Scale Genome Assembly of Mitragyna speciosa (Kratom) and the Assessment of Its Genetic Diversity in Thailand
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Genome Size Estimation
2.2. Plant Materials and DNA/RNA Isolation
2.3. Genome and Isoform Sequencing (Iso-seq) Library Preparation
2.4. Hi-C Library Preparation and Sequencing
2.5. PacBio Draft Assembly and Hi-C Scaffolding
2.6. Genome Assembly Evaluation
2.7. Phylogenetic Analyses and Comparative Genomics
2.8. Genome Synteny Analysis
2.9. Population Structure and Genetic Diversity Analyses
3. Results
3.1. M. speciosa Genome Assembly and Annotation
3.2. Identification of Repetitive Elements in the M. speciosa Genome
3.3. Comparative Genomics and Phylogenetic Analyses
3.4. Genetic Diversity and Population Structure in M. speciosa
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Assembly statistic | PacBio | PacBio + Hi-C |
---|---|---|
N50 scaffold size (bases) | 922,929 | 26,436,849 |
L50 scaffold number | 200 | 12 |
N75 scaffold size (bases) | 339,531 | 23,267,248 |
L75 scaffold number | 502 | 19 |
N90 scaffold size (bases) | 41,110 | 56,307 |
L90 scaffold number | 1456 | 159 |
Assembly size (bases) | 692,306,703 | 692,445,403 |
Number of scaffolds | 4259 | 2888 |
Number of scaffolds ≥ 100 kb | 862 | 64 |
Number of scaffolds ≥ 1 Mb | 173 | 24 |
Number of scaffolds ≥ 10 Mb | 0 | 22 |
Longest scaffold (bases) | 7,719,426 | 34,865,628 |
% N | 0 | 0.02 |
GC content (%) | 34.59 | 34.59 |
BUSCO evaluation (% completeness) | - | 98.4 |
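
The N50/L50, N75/L75 and N90/L90 rows above follow the standard contiguity definition: with scaffolds sorted from longest to shortest, Nx is the length of the scaffold at which the running total first reaches x% of the assembly size, and Lx is the number of scaffolds needed to get there. The sketch below illustrates that definition only. The function name `nx_lx` and the toy lengths are hypothetical, and the source does not state which tool produced the values in the table.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of scaffold lengths.

    fraction=0.5 gives N50/L50, 0.75 gives N75/L75, 0.9 gives N90/L90.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    if not ordered:
        raise ValueError("no scaffold lengths given")

    threshold = fraction * sum(ordered)  # bases that must be covered
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # 'length' is Nx; 'count' scaffolds were needed, so it is Lx.
            return length, count
    raise RuntimeError("threshold not reached")  # not expected for valid input


# Toy scaffold lengths (illustrative only, not taken from the assembly).
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50={n50} L50={l50} N90={n90} L90={l90}")
```

Read this way, the drop in L50 from 200 to 12 after Hi-C scaffolding means that half of the assembled bases moved from 200 scaffolds into 12.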
Types of Repeats | Bases (Mb) | % of the Assembly | % of Total Repeats |
---|---|---|---|
DNA transposons | 9.98 | 1.44 | 3.92 |
Retrotransposons: | | | |
LINE | 17.58 | 2.54 | 2.31 |
SINE | 0.001 | 0.00 | 0.00 |
LTR: Copia | 44.08 | 6.36 | 12.93 |
LTR: Gypsy | 58.74 | 8.48 | 16.96 |
LTR: Others | 1.98 | 0.28 | 0.56 |
Simple sequence repeats | 27.71 | 4.00 | 7.95 |
Others | 194.54 | 28.11 | 55.37 |
Total | 354.61 | 51.21 | |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pootakham, W.; Yoocha, T.; Jomchai, N.; Kongkachana, W.; Naktang, C.; Sonthirod, C.; Chowpongpang, S.; Aumpuchin, P.; Tangphatsornruang, S. A Chromosome-Scale Genome Assembly of Mitragyna speciosa (Kratom) and the Assessment of Its Genetic Diversity in Thailand. Biology 2022, 11, 1492. https://doi.org/10.3390/biology11101492