Genome Survey Sequencing of Indigofera pseudotinctoria and Identification of Its SSR Markers
Abstract
1. Introduction
2. Materials and Methods
2.1. Plant Materials
2.2. Genome Size Estimation by Flow Cytometry
2.3. Genome Survey Sequencing and Quality Control
2.4. Genome Sequencing Assembly and GC Content
2.5. Genomic SSR Identification and PCR Amplification
2.6. Gene Prediction and Annotation
3. Results
3.1. Genome Size Estimation by Flow Cytometry
3.2. Genome Sequencing and Sequence Assembly
3.3. Genome Size Estimation and GC Content
3.4. Identification and Verification of SSRs
3.5. Gene Prediction and Annotation
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Locus | Repeat Motif | Forward Primer (5′-3′) | Reverse Primer (5′-3′) |
---|---|---|---|
IP380501 | (GTT)5 | AATTTTTCCACGGGGTCTTC | GTTGGTTTTATCCGTCGCTT |
IP553716 | (ATA)5 | ATTGGTTGTGTGGACCGAAT | TCAAATTATTCCCTTATTCAAATTCA |
IP885913 | (ATTT)5 | AATACAGGTGAGCAGTGCGA | TGAAATTCCACCACAATGGA |
IP1040384 | (AAAT)5 | AATTGTCCTCGTGTTGTGAGG | AATGGTGCGAATTTTATGCTT |
IP1047571 | (GAT)5 | TCCTAAGCCACCACAAATCC | CCATCTCCTACCTTCCAACTTC |
IP1227591 | (AGA)5 | ACGAATCAGAAGAACAGGGC | TCTCTCACAAACACCGACCA |
IP1434377 | (GGC)5 | TTCGATTTGGATTTGCACTG | AGAATGTTCTGCACCGTTCC |
IP2408442 | (TATT)7 | CGCTGTTTAGGTTAACATTCCA | ACATCCCCATTAACTCAACATAG |
IP10099562 | (TAC)5 | GGCCCTTTTCATTCCTTTTC | ACAACAAGGAGCTCTTCCCA |
IP10099615 | (CTT)8 | TGCAGCAATGATGACATCTG | TTGGCACCACATCAAACAGT |
IP10125461 | (AAT)7 | GGAAGCTACTCTGCATCGGA | CATGCTCATCTCAGGCATGT |
IP10125944 | (TCT)6 | ACCATTAGGCAGAGAGGCAA | TTGCACATGATTCGTTCTCC |
IP10130648 | (TAC)5 | TGTCAGCTTTTGAAGCATGG | GGCCAAAAGTGCAACATTCT |
IP10130698 | (CCT)6 | CCTCCACCTCCCATGTAGAA | AGCCACAAGCTACCTCAGGA |
IP10130710 | (CAA)5 | GGGGTTATTCAGTCCCGTTT | GACGCGACCCAATTGTAACT |
IP10130769 | (GAT)6 | CGAGAGGTTAGGGGGAGATT | CCCACAAATTAAGGGCATGA |
IP10130792 | (ACA)7 | TTGCCACAAATACGCAAAAA | TTCTCAGGTCTGCTCTCGCT |
Genome Size (Mb) Mean ± SD | CV (%) of Standard | CV (%) of Sample |
---|---|---|
920 ± 2 | 4.73 | 3.56 |
Number of Raw Reads | Raw Base (Gbp) | Clean Base (Gbp) | Q20 (%) | Q30 (%) | GC Content (%) |
---|---|---|---|---|---|
365,359,150 | 54.804 | 48.952 | 98.18 | 93.51 | 35.96 |
Total Length (bp) | Total Number | Max Length (bp) | N50 Length (bp) | N75 Length (bp) | GC Content (%) |
---|---|---|---|---|---|
431,452,197 | 553,021 | 90,236 | 3506 | 763 | 34.3 |
K-mer | Depth | n_kmer | Genome Size (bp) | Heterozygous Ratio | Repeat Sequence Content |
---|---|---|---|---|---|
17 | 68 | 43,729,058,443 | 6.37 × 10⁸ | 0.98% | 66.30% |
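
The k-mer row above can be sanity-checked with simple arithmetic. The sketch below assumes the common textbook relation (genome size ≈ total k-mer count / main-peak depth); this excerpt does not state how the reported genome size was derived, so the formula and the variable names are illustrative assumptions, not the authors' pipeline. The uncorrected estimate comes out at about 643 Mb, slightly above the reported value, which would be consistent with some correction step (for example, discarding low-frequency error k-mers) that is not described here.

```python
# Values copied from the k-mer analysis table (k = 17).
n_kmer = 43_729_058_443  # total number of 17-mers counted
depth = 68               # depth of the main k-mer peak
reported_bp = 6.37e8     # genome size reported in the table

# Assumed relation: genome size ~ total k-mers / peak depth.
# This is an uncorrected estimate, not the authors' exact method.
raw_estimate_bp = n_kmer / depth

print(f"uncorrected estimate: {raw_estimate_bp / 1e6:.1f} Mb")
print(f"reported genome size: {reported_bp / 1e6:.1f} Mb")
print(f"difference:           {(raw_estimate_bp - reported_bp) / 1e6:.1f} Mb")
```
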
Searching Item | Number | Ratio (%) |
---|---|---|
Total number of sequences examined | 548,189 | |
The total size of examined sequences (bp) | 430,490,629 | |
Total number of identified SSRs | 240,659 | 100 |
Number of SSR containing sequences | 122,195 | 50.78 |
Number of sequences containing more than 1 SSR | 46,239 | 19.21 |
Number of SSRs present in the compound formation | 28,086 | 11.67 |
Mononucleotide | 180,491 | 75.00 |
Dinucleotide | 35,978 | 14.95 |
Trinucleotide | 20,213 | 8.40 |
Tetranucleotide | 3140 | 1.30 |
Pentanucleotide | 515 | 0.21 |
Hexanucleotide | 322 | 0.13 |
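
The ratio column for the six motif classes can be recomputed from the counts alone. The sketch below is a minimal check, assuming each ratio is the class count divided by the total number of identified SSRs; the dictionary and variable names are hypothetical and introduced only for illustration.

```python
# SSR counts per motif class, copied from the table above.
ssr_counts = {
    "Mononucleotide": 180_491,
    "Dinucleotide": 35_978,
    "Trinucleotide": 20_213,
    "Tetranucleotide": 3_140,
    "Pentanucleotide": 515,
    "Hexanucleotide": 322,
}

# The class counts should add up to the total number of identified SSRs.
total_ssrs = sum(ssr_counts.values())

# Assumed definition of the ratio column: 100 * class count / total SSRs.
ratios = {name: round(100 * n / total_ssrs, 2) for name, n in ssr_counts.items()}

print(f"total SSRs: {total_ssrs:,}")
for name, pct in ratios.items():
    print(f"{name:<16} {pct:6.2f} %")
```
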
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, J.; Ran, Q.; Xu, Y.; Zhao, J.; Ma, X.; He, W.; Fan, Y. Genome Survey Sequencing of Indigofera pseudotinctoria and Identification of Its SSR Markers. Genes 2025, 16, 991. https://doi.org/10.3390/genes16090991