Article

Complete Chloroplast Genomes of Fagus sylvatica L. Reveal Sequence Conservation in the Inverted Repeat and the Presence of Allelic Variation in NUPTs

1 Department of Genetics, Faculty of Biological Sciences, Kazimierz Wielki University, Chodkiewicza 30, 85-064 Bydgoszcz, Poland
2 Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325 Frankfurt am Main, Germany
3 Department of Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe University, Max-von-Laue-Str. 13, 60483 Frankfurt am Main, Germany
4 LOEWE Centre for Translational Biodiversity Genomics, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany
* Author to whom correspondence should be addressed.
Genes 2021, 12(9), 1357; https://doi.org/10.3390/genes12091357
Submission received: 24 May 2021 / Revised: 21 August 2021 / Accepted: 27 August 2021 / Published: 29 August 2021
(This article belongs to the Section Plant Genetics and Genomics)

Abstract

Growing amounts of genomic data and more efficient assembly tools advance organelle genomics at an unprecedented scale. Genomic resources are increasingly used for phylogenetic analyses of many plant species, but are less frequently used to investigate within-species variability and phylogeography. In this study, we investigated the genetic diversity of Fagus sylvatica, an important broadleaved tree species of European forests, based on complete chloroplast genomes of 18 individuals sampled widely across the species distribution. Our results confirm the hypothesis of low cpDNA diversity in European beech. The chloroplast genome size was remarkably stable (158,428 ± 37 bp). The polymorphic markers, 12 microsatellites (SSRs), four SNPs and one indel, were found only in the single copy regions, while the inverted repeat regions were monomorphic both in terms of length and sequence, suggesting highly efficient suppression of mutation. The within-individual analysis of polymorphisms revealed more than 9000 markers, which were distributed between genic and non-genic regions in proportion to the size of those regions. However, an investigation of the frequency of alternate alleles revealed that this diversity likely originated from nuclear-encoded plastome remnants (NUPTs). Phylogeographic and Mantel correlation analyses based on the complete chloroplast genomes exhibited clustering of individuals according to geographic distance in the first distance class, suggesting that the novel markers, and in particular the cpSSRs, could provide a more detailed picture of beech population structure in Central Europe.

1. Introduction

Chloroplasts play a key role not only in photosynthesis but also in other metabolic processes of green plants [1]. The generally maternal inheritance of the chloroplast genome in Angiosperms and its relatively conserved gene content and order have made chloroplast genomes a valuable resource for phylogenetic and evolutionary studies [2,3]. Plant chloroplast genomes are mostly between 120 kb and 160 kb in length and usually have a quadripartite circular structure comprising two inverted repeat regions A and B (IR-A/IR-B), separated by a large single-copy (LSC) region and a small single-copy (SSC) region [4]. Owing to next-generation sequencing approaches, sequencing of chloroplast genomes for dozens or hundreds of individuals is now achievable through whole genome sequencing [5,6]. Chloroplast genome sequences have helped to elucidate the phylogenetic relationships and evolutionary history of many tree genera, including Acer [7], Prunus [8], Populus [9,10], Quercus [11,12] and Pinus [13]. With the increased availability of whole chloroplast sequences, numerous studies have demonstrated the presence of variation among individuals within species, which includes SNPs, indels, inversions, translocations, copy number variations and also IR expansion, gene loss and intron retention [14]. The level of this variability is usually considered low, both in terms of composition and in terms of the degree of variation within the different regions [14,15,16,17]. Interestingly, individuals themselves have been reported to show variation in their chloroplast genomes through heteroplasmy, which can occur as a result of independent mutations or biparental inheritance of organelles in one organism [18]. While chloroplast genomes can be a good tool for phylogeographic analyses, such studies are currently limited to only a few conifers [16,17].
European beech (Fagus sylvatica L.) is ecologically and economically one of the most important broadleaved tree species in Europe [19]. Several molecular studies have evaluated the genetic diversity and structure of European beech using chloroplast DNA [20,21,22,23,24,25,26,27]. Most of these studies showed a low level of chloroplast diversity and a rather homogeneous genetic structure in Central Europe, but none of them exploited the full potential resolution offered by complete chloroplast genomes. Thus, there is still a lack of comparative analyses based on complete chloroplast genomes, which would allow the identification of novel chloroplast polymorphisms and the detection of genetic structure of European beech at a regional scale. Recently, Mishra and coauthors [26] evaluated three complete chloroplast genomes of beech from areas glaciated during the Weichselian glacial maximum and found very low genetic variation, with only two SNP and three indel positions. This raised the question of whether the low variation found was due to genetic impoverishment by founder effects at the leading edge of the recolonization or whether chloroplast genetic diversity is generally low in European beech. To clarify this, there is a need to assess the genetic diversity of complete chloroplast genomes of European beech sampled from a wider range, which was the aim of the current study.
Here, we report 16 newly sequenced and assembled complete chloroplast genomes of F. sylvatica and perform comparative genomic analyses of the new sequences with the two recently published chloroplast genomes: Bhaga (MW531753) and Jamy (MW537046) [26]. The aim of our study was to identify potentially highly variable markers in the chloroplast genome of F. sylvatica that are suitable for phylogeographic studies and can serve as a genetic resource for developing chloroplast-based genetic markers (SNPs and SSRs) for large-scale population studies.

2. Materials and Methods

2.1. DNA Isolation and Sequencing

Details regarding DNA isolation and sequencing of the Bhaga and Jamy individuals are given in Mishra and coauthors [26]. The remaining 16 individuals, representing a wide range of the species distribution, were collected from the Siemianice provenance trial [28]. Detailed geographic locations are presented in Table 1 and Figure S1. DNA was isolated from leaves with a GeneMATRIX Plant & Fungi DNA Purification Kit (EURx, Poland), after storing them in the dark for 48 h. Genomic library preparation and sequencing were done by an external service provider (IGA Technology Services s.r.l.) on an Illumina HiSeq 2500 device in 125-bp PE mode. Adapter sequences were removed and the reads were trimmed with Trimmomatic [29]. Raw reads were deposited in SRA under the accession numbers listed in Table 1.

2.2. Chloroplast Genome Assemblies and Annotation

Methods describing assembly and annotation of the chloroplast genomes of the Bhaga and Jamy individuals are presented in Mishra and coauthors [26]. All the remaining 16 chloroplast genomes were generated using the same protocol: Illumina reads were used for de novo assembly using NOVOPlasty v 4.2 [30,31] with the seed sequence NCBI: AY453092.1 [32] and the Bhaga chloroplast genome for guiding the program in both inverted repeat regions. After assembly, the sequences were manually checked. Where ambiguous nucleotides were present, manual curation was done with the assistance of reads mapped to the genome with bwa-mem [33] and visualization of the results in Tablet software [34]. Annotation of coding sequences and RNA elements was done with GeSeq ChloroBox [35] using the chloroplast genomes of F. crenata (NC_041252; [36]), F. engleriana (KX852398; [37]), F. japonica (MT762294; [38]) and F. sylvatica (NC_041437; [39]) as references. Postprocessed annotated genomes were uploaded to GenBank; for accession numbers see Table 1.

2.3. Assessment of Genome Variation

REPuter was employed to identify four types of large repeating sequences (reverse, forward, complement and palindromic) with a minimum repeat size of 30 bp, a Hamming distance equal to 3 and the maximum number of computed repeats set to 50 [40]. Identification of chloroplast simple sequence repeats (cpSSR) was done using MISA [41]. The minimum number of repeat units was set to eight, six, five, five, three, and three, for mononucleotides, dinucleotides, trinucleotides, tetranucleotides, pentanucleotides, and hexanucleotides, respectively. To assess variation among the 18 studied chloroplast genomes, alignments were done with MAFFT v 7.450 [42], as implemented in Unipro UGENE [43]. Variations among the genomes were then highlighted using the Bhaga chloroplast genome as the reference.
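The repeat-unit thresholds above can be illustrated with a minimal Python sketch. This is not MISA and does not reproduce its behaviour: it only finds perfect repeats with a regular expression, ignores compound ("complex") SSRs, and the function name `find_perfect_ssrs` is hypothetical.

```python
import re

# Minimum repeat counts per motif length, as stated in the text
# (mono- to hexanucleotide): 8, 6, 5, 5, 3, 3.
MIN_REPEATS = {1: 8, 2: 6, 3: 5, 4: 5, 5: 3, 6: 3}


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples; start is 1-based."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif of motif_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AA" is a mononucleotide repeat, not a dinucleotide one).
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start() + 1, motif, len(m.group(0)) // motif_len))
    return sorted(hits)
```

Applied to each of the aligned genomes, a scan of this kind would give per-locus repeat counts that can be compared between individuals. A run of seven identical bases falls below the mononucleotide threshold and is not reported.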

2.4. Detection of Heteroplasmy

Potential chloroplast genome variation within individuals (heteroplasmy) was assessed by mapping the reads of each individual with bwa [33] to the extracted SSC, LSC and IR regions of the genomes. Variant calling was done with Freebayes [44], using thresholds of 0.02 for minor allele frequency (MAF) and 200x for depth to avoid calling sequencing errors as variants [45]. To verify the origin of the markers, reads were mapped to chromosome 10 of the Fagus sylvatica nuclear assembly [46].
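The two thresholds can be expressed as a simple site-level filter. The sketch below is only an illustration of the stated criteria, not the Freebayes configuration used in the study. The `Site` record and function names are hypothetical, only biallelic sites are modelled, and treating the 200x boundary as inclusive is an assumption.

```python
from dataclasses import dataclass

MIN_DEPTH = 200   # minimum total read depth at the site (from the text)
MIN_MAF = 0.02    # minimum minor allele frequency (from the text)


@dataclass
class Site:
    position: int
    ref_depth: int  # reads supporting the reference allele
    alt_depth: int  # reads supporting the alternate allele


def minor_allele_frequency(site):
    """Frequency of the less common of the two alleles at a site."""
    total = site.ref_depth + site.alt_depth
    return min(site.ref_depth, site.alt_depth) / total if total else 0.0


def passes_filters(site):
    """Keep a site only if it is deep enough and the minor allele is frequent enough."""
    total = site.ref_depth + site.alt_depth
    return total >= MIN_DEPTH and minor_allele_frequency(site) >= MIN_MAF
```

For example, a site with 480 reference and 20 alternate reads (MAF 0.04 at 500x) is retained, while a site with 495 reference and 5 alternate reads (MAF 0.01) is discarded as a likely sequencing error.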

2.5. Phylogenetic Analysis

Phylogenetic analysis was based on the dataset of the 18 Fagus sylvatica assemblies, to which the complete chloroplast genomes of F. crenata (NC_041252; [36]), F. japonica (MT762294; [38]) and F. engleriana (KX852398; [37]) were added before alignment as described above. Phylogenetic reconstructions were done using IQ-TREE with the GTR+I+R model [47,48,49] and 1000 bootstrap replicates. The resulting phylogram was edited with Figtree 1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/, accessed on 21 March 2021) with midpoint rooting and proportional transformation of branches.
The phylogenetic distance matrix obtained from IQ-TREE was also used to test the relationship between genetic and geographic distances among individuals. The correlation was calculated with PASSaGE 2 [50] using Mantel’s test [51] for a global assessment and Mantel’s correlogram to search for significance within 10 equally paired distance classes (the largest class excluded). All tests were performed with 10,000 permutations.
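The global Mantel test can be sketched in plain Python as follows. This is not the PASSaGE 2 implementation: the function names are hypothetical, and the one-sided p-value and the fixed random seed are assumptions made for illustration. The statistic is the Pearson correlation between the upper triangles of the two distance matrices, and significance is assessed by jointly permuting the rows and columns of one matrix.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle values of a square matrix after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel_test(genetic, geographic, permutations=10000, seed=0):
    """Return (r, one-sided p) for two symmetric distance matrices."""
    identity = list(range(len(genetic)))
    geo = _upper(geographic, identity)
    r_obs = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        # Permute rows and columns of one matrix together.
        if _pearson(_upper(genetic, order), geo) >= r_obs:
            hits += 1
    # The observed value counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)
```

A correlogram applies the same permutation logic separately within each distance class, so a pattern confined to short distances can be significant even when the global test is not.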

3. Results

3.1. Assembly Size Variance and Genome Annotation

Read coverage of each of the 16 new assemblies varied from 86x to 625x, with a mean of 302x and a median of 284x. Chloroplast genome structure was stable throughout the studied sequences, and assembly sizes varied from 158,391 bp to 158,464 bp, with the highest length variance observed in the large single copy region (LSC) (87,634–87,706 bp). The small single copy regions (SSC) differed only by 4 bp (19,010–19,013 bp), while both inverted repeat regions (IR-A/IR-B) were identical in length (25,873 bp) (Table 2).
Similarly to the previously published F. sylvatica chloroplast assemblies [26,39], each of the 16 new genomes had an identical set of 131 annotated elements: 83 protein coding genes, 8 rRNAs and 40 tRNAs. The share of coding elements differed across the main regions of the genome, with 51% in LSC, 72.1% in SSC and 59.1% in IR regions.

3.2. Repeat Elements and SNPs

Large repeating sequence (LRS) assessment showed that sixteen genomes had 32 LRSs of at least 30 bp: 16 forward, 13 reverse, one complement and two palindromic. Two assemblies (Gdańsk and Glorup) had 31 LRSs as a result of the loss of one palindromic match, and one chloroplast genome (Ehingen) had 33 LRSs, with an additional reverse match compared to the 16 previously mentioned assemblies. Analysis of cpSSRs using MISA detected a total of 138 markers, of which only 4 of 97 mononucleotide cpSSRs and 8 of 35 complex cpSSRs were polymorphic. All discovered dinucleotide and pentanucleotide cpSSRs were found to be monomorphic (Table 3).
In the SSC region we found four polymorphic markers: two mononucleotide and two complex repeats. Variation of the mononucleotide (T) cpSSR at position 12,538 occurred due to the absence of a microsatellite in one of the individuals (Fantanele). The LSC region had eight polymorphic markers, with six complex and two mononucleotide cpSSRs (Table 4). The marker ratio, reflecting the number of individuals associated with a particular variant, showed that in eight sites an alternative nucleotide was present, while three cpSSR loci had two variants and one locus had five variants.
Alignment of the 18 chloroplast genomes revealed four SNPs and one indel located in noncoding regions. The first SNP (pos. 12,587) was associated with a mononucleotide cpSSR; the alternative variant was present in an individual from Fantanele, shortening the repetitive sequence to 7 bp. Variants for the second (pos. 46,985) and fourth (pos. 112,198) SNPs, as well as the indel (pos. 80,558), were present in only one individual; however, for the third SNP (pos. 71,204), 50% of individuals had the alternative variant (Table 5).

3.3. Within Individual Polymorphisms

Within-individual polymorphism related to the chloroplast genome was found in all 16 tested individuals. After filtering with a minimum depth of 200x and a MAF of 0.02, a total of 9028 markers were detected in all analyzed regions.
The average depth of each base at a variant position was lower in the single copy regions, 349x and 360x in LSC and SSC, respectively, while both IR regions had a depth of 477x (Table 6). However, the average depth of the alternative variant was very similar across the main genome regions: 16.1x in SSC, 18.4–18.5x in the IRs and 18.7x in LSC. This suggests that these variants represent nuclear-encoded plastome sequences (NUPTs), as these values are similar to the average coverage of 17x found on chromosome 10 of the complete nuclear genome. This chromosome was selected for comparative analysis because it had the lowest NUPT detection in the whole assembly.
Among the variant positions, the majority were SNPs (76.8–84.1%); the remaining share was associated with indels (8.8–10.2%), complex markers (3.1–8.2%) and MNPs (0.2–0.7%). In 2.7–4.6% of sites a mix of variants was detected, e.g., a SNP and an indel called at a specific position in the same individual.
Markers detected in this study were found both in coding (48.6–67.3%) and non-coding areas (32.7–51.4%). The share of markers in each category was related to the relative size of the coding and non-coding portions of the respective genome region (Figure 1).
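The reasoning behind the NUPT interpretation can be made concrete with a small calculation on the averages quoted above. The sketch below is only an illustration: the variable and function names are hypothetical, and the IR alternate depth is taken as the midpoint of the reported 18.4–18.5x range, which is an assumption. If the variants were true plastid heteroplasmy, the alternate depth would be expected to scale with the plastid read depth. Instead it stays close to the nuclear coverage, so the alternate-allele fraction falls as total depth rises.

```python
# Average depths reported in the text: total depth at variant sites and
# depth of the alternate variant, per genome region.
REGIONS = {
    "LSC": {"total": 349.0, "alt": 18.7},
    "SSC": {"total": 360.0, "alt": 16.1},
    "IR": {"total": 477.0, "alt": 18.45},  # assumed midpoint of 18.4-18.5x
}
NUCLEAR_COVERAGE = 17.0  # average coverage of chromosome 10 (from the text)


def summarize(regions, nuclear_coverage):
    """Alternate-allele fraction and alternate depth relative to nuclear coverage."""
    return {
        name: {
            "alt_fraction": d["alt"] / d["total"],
            "alt_vs_nuclear": d["alt"] / nuclear_coverage,
        }
        for name, d in regions.items()
    }


if __name__ == "__main__":
    for name, row in summarize(REGIONS, NUCLEAR_COVERAGE).items():
        print(f"{name}: alt fraction {row['alt_fraction']:.3f}, "
              f"alt depth / nuclear coverage {row['alt_vs_nuclear']:.2f}")
```

In all three regions the alternate depth is within roughly 10% of the nuclear coverage, while the alternate-allele fraction is lowest in the IR, where total depth is highest.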

3.4. Phylogenetic Analysis

The complete chloroplast genome sequences of 18 F. sylvatica individuals, as well as F. crenata, F. japonica and F. engleriana, were used for a phylogenetic reconstruction based on the maximum likelihood method and 1000 bootstrap replicates (Figure 2).
While a global Mantel’s test did not reveal a significant relationship (r = 0.2190; p = 0.1861) between the phylogenetic and geographic distances for the 18 individuals, clustering of similar individuals was confirmed with Mantel’s correlogram within the first distance class (<250 km; r = 0.286; p = 0.011) (Table 7).

4. Discussion

Complete chloroplast genomes have helped to reveal species relationships [9,11,12], and also allow divergence within populations to be measured [16,17]. Growing genomic resources for European beech provide tools to extend our knowledge of this critically important forest tree species. Our results support previous hypotheses suggesting low genetic diversity of the beech plastome [23,52,53].
The total genome length variation (158,428 ± 37 bp) and the presence of polymorphisms were associated exclusively with the single copy (SC) regions, while the pair of IR regions was monomorphic both in terms of length (25,873 bp) and nucleotide sequence. Stability of the IR and variability of the SC regions were also present in sequences of F. japonica (MT762294; MT762295); these results suggest a powerful gene conversion mechanism in Fagus species. Our study revealed 138 cpSSRs in F. sylvatica, of which 126 were monomorphic. This group included the universal cpSSRs ccmp4, ccmp7 and ccmp10, commonly used to assess phylogeny and relationships in eudicot species [54,55]. Using these markers, Magri and coauthors [23] concluded that Central European beech populations can generally be considered a homogeneous group. The 12 polymorphic microsatellite markers discovered in this study, when applied to a higher number of individuals and populations, could, however, potentially provide a more detailed phylogeographic picture. Our phylogeographic analysis supports this assumption, as it showed significant clustering of individuals over a relatively short distance (<250 km).
An additional source of variation was found in SNPs (4) and an indel (1), all located in noncoding regions, but their positions are not in line with the results obtained from reduced representation genomic libraries by Meger and coauthors [27]. Heteroplasmy is well reported in plants with known biparental inheritance of chloroplasts, even though in some species (e.g., Passiflora) it can occur at the embryo and seedling stages but not at mature developmental stages [56]. In beech, due to maternal inheritance of organelles [3], heteroplasmy can be caused exclusively by mutations. The evidence of multiple integrations of organelle DNA into the nuclear genome of beech [46] and the detection of within-individual polymorphisms of cpDNA-related sequences presented in this study suggest that, owing to the widespread occurrence of nuclear-encoded plastid DNA (NUPTs), assessing beech diversity with chloroplast-related SNPs can lead to uncertain results, which should be interpreted with caution [57].

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/genes12091357/s1, Figure S1: Distribution map of the individuals origin sites and the Fagus sylvatica species range.

Author Contributions

Conceptualization, B.U., J.B.; methodology, B.U., J.M.; investigation, B.U., J.M.; data curation, B.U., B.M.; writing—original draft preparation, B.U., J.M.; writing—review and editing, B.U., J.M., B.M., M.T., J.B.; supervision, M.T., J.B.; project administration, M.T., J.B.; funding acquisition, M.T., J.B. All authors have read and agreed to the published version of the manuscript.

Funding

The study was supported by the National Science Center, Poland (2012/04/A/NZ9/00500), the Polish Ministry of Science and Higher Education under the program “Regional Initiative of Excellence” in 2019–2022 (Grant No. 008/RID/2018/19), and the German Science Foundation (Grant No. Thi1362/18-1). Additional support by the LOEWE initiative in the framework of the Centre for Translational Biodiversity Genomics (TBG) is gratefully acknowledged.

Data Availability Statement

The chloroplast genome sequences have been deposited in GenBank under the accession numbers: MW566769, MW566771, MW566772, MW566770, MW566774, MW566776, MW566773, MW566778, MW566784, MW566775, MW566783, MW566777, MW566779, MW566782, MW566780, MW566781.

Acknowledgments

We would like to thank Władysław Barzdajn from Poznań University of Life Sciences and the staff of the Experimental Forest District in Siemianice. We also thank our lab team members, Ewa Sztupecka and Katarzyna Meyza, for their outstanding job in fieldwork and DNA isolations. The authors gratefully acknowledge the permission of the office of the National Park Kellerwald-Edersee to sample the reference individual Bhaga.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Douglas, S.E. Plastid evolution: Origins, diversity, trends. Curr. Opin. Genet. Dev. 1998, 8, 655–661.
  2. Wu, Z.Q.; Ge, S. The phylogeny of the BEP clade in grasses revisited: Evidence from the whole-genome sequences of chloroplasts. Mol. Phylogenet. Evol. 2012, 62, 573–578.
  3. Birky, C.W., Jr. The inheritance of genes in mitochondria and chloroplasts: Laws, mechanisms, and models. Ann. Rev. Genet. 2001, 35, 125–148.
  4. Wicke, S.; Schneeweiss, G.M.; de Pamphilis, C.W.; Muller, K.F.; Quandt, D. The evolution of the plastid chromosome in land plants: Gene content, gene order, gene function. Plant Mol. Biol. 2011, 76, 273–297.
  5. Straub, S.C.; Parks, M.; Weitemier, K.; Fishbein, M.; Cronn, R.C.; Liston, A. Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics. Am. J. Bot. 2012, 99, 349–364.
  6. Bock, D.G.; Andrew, R.L.; Rieseberg, L.H. On the adaptive value of cytoplasmic genomes in plants. Mol. Ecol. 2014, 23, 4899–4911.
  7. Xia, X.; Yu, X.; Fu, Q.; Zheng, Y.; Zhang, C. Complete chloroplast genome sequence of the three-flowered maple, Acer triflorum (Sapindaceae). Mitochondrial DNA Part B 2020, 5, 1859–1860.
  8. Xue, S.; Shi, T.; Luo, W.; Ni, X.; Iqbal, S.; Ni, Z.; Huang, X.; Yao, D.; Shen, Z.; Gao, Z. Comparative analysis of the complete chloroplast genome among Prunus mume, P. armeniaca, and P. salicina. Hortic. Res. 2019, 6, 89.
  9. Zong, D.; Gan, P.; Zhou, A.; Li, J.; Xie, Z.; Duan, A.; He, C. Comparative analysis of the complete chloroplast genomes of seven Populus species: Insights into alternative female parents of Populus tomentosa. PLoS ONE 2019, 14, e0218455.
  10. Du, S. The complete chloroplast genome sequence of Populus wilsonii based on landscape design, and a comparative analysis with other Populus species. Mitochondrial DNA B Resour. 2020, 5, 2716–2718.
  11. Yang, Y.; Zhou, T.; Duan, D.; Yang, J.; Feng, L.; Zhao, G. Comparative Analysis of the Complete Chloroplast Genomes of Five Quercus Species. Front. Plant. Sci. 2016, 7, 959.
  12. Liu, X.; Chang, E.; Liu, J.; Jiang, Z. Comparative analysis of the complete chloroplast genomes of six white oaks with high ecological amplitude in China. J. For. Res. 2021.
  13. Asaf, S.; Khan, A.L.; Khan, M.A.; Shahzad, R.; Lubna; Kang, S.M.; Al-Harrasi, A.; Al-Rawahi, A.; Lee, I.J. Complete chloroplast genome sequence and comparative analysis of loblolly pine (Pinus taeda L.) with related species. PLoS ONE 2018, 13, e0192966.
  14. Sabir, J.S.; Arasappan, D.; Bahieldin, A.; Abo-Aba, S.; Bafeel, S.; Zari, T.A.; Edris, S.; Shokry, A.M.; Gadalla, N.O.; Ramadan, A.M.; et al. Whole mitochondrial and plastid genome SNP analysis of nine date palm cultivars reveals plastid heteroplasmy and close phylogenetic relationships among cultivars. PLoS ONE 2014, 9, e94158.
  15. Li, F.W.; Harkess, A. A guide to sequence your favorite plant genomes. Appl. Plant Sci. 2018, 6, e1030.
  16. Bondar, E.I.; Putintseva, Y.A.; Oreshkova, N.V.; Krutovsky, K.V. Siberian larch (Larix sibirica L.) chloroplast genome and development of polymorphic chloroplast markers. BMC Bioinform. 2019, 20, 38.
  17. Chen, S.; Ishizuka, W.; Hara, T.; Goto, S. Complete Chloroplast Genome of Japanese Larch (Larix kaempferi): Insights into Intraspecific Variation with an Isolated Northern Limit Population. Forests 2020, 11, 884.
  18. Frey, J.E. Genetic flexibility of plant chloroplasts. Nature 1999, 398, 115–116.
  19. Packham, J.R.; Thomas, P.A.; Atkinson, M.D.; Degen, T. Biological Flora of the British Isles: Fagus sylvatica. J. Ecol. 2012, 100, 1557–1608.
  20. Demesure, B.; Comps, B.; Petit, R.J. Chloroplast DNA phylogeography of the common beech (Fagus sylvatica L.) in Europe. Evolution 1996, 50, 2515–2520.
  21. Sebastiani, F.; Carnevale, S.; Vendramin, G.G. A new set of mono- and dinucleotide chloroplast microsatellites in Fagaceae. Mol. Ecol. Notes 2004, 4, 259–261.
  22. Vettori, C.; Vendramin, G.G.; Anzidei, M.; Pastorelli, R.; Paffetti, D.; Giannini, R. Geographic distribution of chloroplast variation in Italian populations of beech (Fagus sylvatica L.). Theor. Appl. Genet. 2004, 109, 1–9.
  23. Magri, D.; Vendramin, G.G.; Comps, B.; Dupanloup, I.; Geburek, T.; Gomory, D.; Latalowa, M.; Litt, T.; Paule, L.; Roure, J.M.; et al. A new scenario for the quaternary history of European beech populations: Palaeobotanical evidence and genetic consequences. New Phytol. 2006, 171, 199–221.
  24. Hatziskakis, S.; Papageorgiou, A.C.; Gailing, O.; Finkeldey, R. High chloroplast haplotype diversity in Greek populations of beech (Fagus sylvatica L.). Plant Biol. 2009, 11, 425–433.
  25. Papageorgiou, A.C.; Tsiripidis, I.; Mouratidis, T.; Hatziskakis, S.; Gailing, O.; Eliades, N.G.H.; Vidalis, A.; Drouzas, A.D.; Finkeldey, R. Complex fine-scale phylogeographical patterns in a putative refugial region for Fagus sylvatica (Fagaceae). Bot. J. Linn. Soc. 2014, 174, 516–528.
  26. Mishra, B.; Ulaszewski, B.; Ploch, S.; Burczyk, J.; Thines, M. A Circular Chloroplast Genome of Fagus sylvatica Reveals High Conservation between Two Individuals from Germany and One Individual from Poland and an Alternate Direction of the Small Single-Copy Region. Forests 2021, 12, 180.
  27. Meger, J.; Ulaszewski, B.; Vendramin, G.G.; Burczyk, J. Using reduced representation libraries sequencing methods to identify cpDNA polymorphisms in European beech (Fagus sylvatica L). Tree Genet. Genomes 2019, 15, 7.
  28. Barzdajn, W.; Rzeznik, Z. Wstepne wyniki miedzynarodowego doswiadczenia proweniencyjnego z bukiem (Fagus sylvatica L.) serii 1993/1995 w Lesnym Zakladzie Doswiadczalnym Siemnianice. Sylwan 2002, 146, 149–164.
  29. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
  30. Dierckxsens, N.; Mardulyn, P.; Smits, G. NOVOPlasty: De novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017, 45, e18.
  31. Dierckxsens, N.; Mardulyn, P.; Smits, G. Unraveling heteroplasmy patterns with NOVOPlasty. NAR Genom. Bioinform. 2020, 2, lqz011.
  32. Manos, P.S.; Stanford, A.M. The historical biogeography of Fagaceae: Tracking the tertiary history of temperate and subtropical forests of the Northern Hemisphere. Int. J. Plant Sci. 2001, 162, S77–S93.
  33. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013, arXiv:1303.3997v1.
  34. Milne, I.; Bayer, M.; Cardle, L.; Shaw, P.; Stephen, G.; Wright, F.; Marshall, D. Tablet—Next generation sequence assembly visualization. Bioinformatics 2010, 26, 401–402.
  35. Tillich, M.; Lehwark, P.; Pellizzer, T.; Ulbricht-Jones, E.S.; Fischer, A.; Bock, R.; Greiner, S. GeSeq—Versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017, 45, W6–W11.
  36. Worth, J.R.P.; Liu, L.; Wei, F.J.; Tomaru, N. The complete chloroplast genome of Fagus crenata (subgenus Fagus) and comparison with F. engleriana (subgenus Engleriana). PeerJ 2019, 7, e7026.
  37. Yang, Y.; Zhu, J.; Feng, L.; Zhou, T.; Bai, G.; Yang, J.; Zhao, G. Plastid Genome Comparative and Phylogenetic Analyses of the Key Genera in Fagaceae: Highlighting the Effect of Codon Composition Bias in Phylogenetic Inference. Front. Plant Sci. 2018, 9, 82.
  38. Yang, J.; Takayama, K.; Youn, J.S.; Pak, J.H.; Kim, S.C. Plastome Characterization and Phylogenomics of East Asian Beeches with a Special Emphasis on Fagus multinervis on Ulleung Island, Korea. Genes 2020, 11, 1338.
  39. Mader, M.; Schroeder, H.; Schott, T.; Schoning-Stierand, K.; Leite Montalvao, A.P.; Liesebach, H.; Liesebach, M.; Fussi, B.; Kersten, B. Mitochondrial Genome of Fagus sylvatica L. as a Source for Taxonomic Marker Development in the Fagales. Plants 2020, 9, 1274.
  40. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642.
  41. Beier, S.; Thiel, T.; Munch, T.; Scholz, U.; Mascher, M. MISA-web: A web server for microsatellite prediction. Bioinformatics 2017, 33, 2583–2585.
  42. Katoh, K.; Standley, D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013, 30, 772–780.
  43. Okonechnikov, K.; Golosova, O.; Fursov, M.; Team, U. Unipro UGENE: A unified bioinformatics toolkit. Bioinformatics 2012, 28, 1166–1167.
  44. Garrison, E.; Marth, G. Haplotype-based variant detection from short-read sequencing. arXiv 2012, arXiv:1207.3907.
  45. Schirmer, M.; D’Amore, R.; Ijaz, U.Z.; Hall, N.; Quince, C. Illumina error profiles: Resolving fine-scale variation in metagenomic sequencing data. BMC Bioinform. 2016, 17, 125.
  46. Mishra, B.; Ulaszewski, B.; Meger, J.; Pfenninger, M.; Gupta, D.K.; Wötzel, S.; Ploch, S.; Burczyk, J.; Thines, M. A chromosome-level genome assembly of the European Beech (Fagus sylvatica L) reveals anomalies for organelle DNA integration, repeat content and distribution of SNPs. bioRxiv 2021.
  47. Tavaré, S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect. Math. Life Sci. 1986, 17, 57–86.
  48. Soubrier, J.; Steel, M.; Lee, M.S.; Der Sarkissian, C.; Guindon, S.; Ho, S.Y.; Cooper, A. The influence of rate heterogeneity among sites on the time dependence of molecular rates. Mol. Biol. Evol. 2012, 29, 3345–3358.
  49. Yang, Z. A space-time process model for the evolution of DNA sequences. Genetics 1995, 139, 993–1005.
  50. Rosenberg, M.S.; Anderson, C.D. PASSaGE: Pattern Analysis, Spatial Statistics and Geographic Exegesis; Version 2. Methods Ecol. Evol. 2011, 2, 229–232.
  51. Mantel, N. The detection of disease clustering and a generalized regression approach. Cancer Res. 1967, 27, 209–220.
  52. Magri, D. Patterns of post-glacial spread and the extent of glacial refugia of European beech (Fagus sylvatica). J. Biogeogr. 2008, 35, 450–463.
  53. Sjölund, M.J.; González-Díaz, P.; Moreno-Villena, J.J.; Jump, A.S. Understanding the legacy of widespread population translocations on the post-glacial genetic structure of the European beech, Fagus sylvatica L. J. Biogeogr. 2017, 44, 2475–2487.
  54. Heuertz, M.; Fineschi, S.; Anzidei, M.; Pastorelli, R.; Salvini, D.; Paule, L.; Frascaria-Lacoste, N.; Hardy, O.J.; Vekemans, X.; Vendramin, G.G. Chloroplast DNA variation and postglacial recolonization of common ash (Fraxinus excelsior L.) in Europe. Mol. Ecol. 2004, 13, 3437–3452.
  55. Weising, K.; Gardner, R.C. A set of conserved PCR primers for the analysis of simple sequence repeat polymorphisms in chloroplast genomes of dicotyledonous angiosperms. Genome 1999, 42, 9–19.
  56. Shrestha, B.; Gilbert, L.E.; Ruhlman, T.A.; Jansen, R.K. Clade-Specific Plastid Inheritance Patterns Including Frequent Biparental Inheritance in Passiflora Interspecific Crosses. Int. J. Mol. Sci. 2021, 22, 2278.
  57. Scarcelli, N.; Mariac, C.; Couvreur, T.L.; Faye, A.; Richard, D.; Sabot, F.; Berthouly-Salazar, C.; Vigouroux, Y. Intra-individual polymorphism in chloroplasts from NGS data: Where does it come from and how to handle it? Mol. Ecol. Resour. 2016, 16, 434–445.
Figure 1. Share of markers in coding and non-coding regions in relation to the size share of coding and non-coding elements in the main genome regions. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.
Figure 2. Phylogenetic relationships among 18 F. sylvatica individuals, as inferred using Maximum Likelihood, with F. crenata, F. japonica, and F. engleriana as outgroup. Numbers on nodes indicate percentages of bootstrap support from 1000 bootstrap replicates. The genetic distance between the F. sylvatica individuals and the outgroup was shortened by 0.04.
Table 1. Origin of sampled individuals and sequencing data volume.

| No. | Origin or Individual Name | Country | Latitude | Longitude | Number of Read Pairs | NCBI Accession Number | SRA Accession Number |
|---|---|---|---|---|---|---|---|
| 1 | Bhaga | Germany | 51.169167 N | 8.963056 E | [26] | MW531753 | N/A |
| 2 | Jamy | Poland | 53.586019 N | 18.935019 E | [26] | MW537046 | SAMN08948264 |
| 3 | Gdańsk | Poland | 54.383262 N | 18.516724 E | 3,777,769 | MW566769 | SAMN18917950 |
| 4 | Foret des Colettes | France | 46.183328 N | 2.949992 E | 4,899,373 | MW566771 | SAMN18917951 |
| 5 | Limitaciones | Spain | 42.818059 N | 2.249663 W | 6,210,877 | MW566772 | SAMN18917952 |
| 6 | Glorup | Denmark | 55.184748 N | 10.681238 E | 20,891,953 | MW566770 | SAMN18917953 |
| 7 | Łopuchówko | Poland | 52.583300 N | 17.083339 E | 5,114,816 | MW566774 | SAMN18917954 |
| 8 | Hasbruch | Germany | 53.120708 N | 8.4302740 E | 4,650,347 | MW566776 | SAMN18917955 |
| 9 | Bieszczady NP | Poland | 49.117093 N | 22.579103 E | 3,046,013 | MW566773 | SAMN18917956 |
| 10 | Eisenach | Germany | 50.087605 N | 10.106152 E | 4,461,792 | MW566778 | SAMN18917957 |
| 11 | Morbach | Germany | 50.740891 N | 6.980116 E | 5,833,195 | MW566784 | SAMN18917958 |
| 12 | Ehingen | Germany | 48.399106 N | 9.500861 E | 5,632,928 | MW566775 | SAMN18917959 |
| 13 | Veneto | Italy | 46.133489 N | 12.216683 E | 7,741,036 | MW566783 | SAMN18917960 |
| 14 | Cesky Krumlov | Czechia | 48.850035 N | 14.250406 E | 7,853,097 | MW566777 | SAMN18917961 |
| 15 | Brzeziny | Poland | 51.836489 N | 19.601247 E | 7,349,714 | MW566779 | SAMN18917962 |
| 16 | Smolenice | Slovakia | 48.485171 N | 17.372687 E | 5,072,400 | MW566782 | SAMN18917963 |
| 17 | Fantanele | Romania | 46.416750 N | 26.466475 E | 6,584,825 | MW566780 | SAMN18917964 |
| 18 | Fläming | Germany | 52.133389 N | 12.583406 E | 7,423,489 | MW566781 | SAMN18917965 |
Table 2. Statistics for main chloroplast genome elements. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.

| Origin or Individual Name | Read Coverage | NCBI Accession Number | Total Size (bp) | LSC (bp) | SSC (bp) | IR-A/IR-B (bp) |
|---|---|---|---|---|---|---|
| Bhaga | - | MW531753 | 158,458 | 87,702 | 19,010 | 25,873 |
| Jamy | - | MW537046 | 158,462 | 87,705 | 19,011 | 25,873 |
| Gdańsk | 253x | MW566769 | 158,456 | 87,699 | 19,011 | 25,873 |
| Colettes | 498x | MW566771 | 158,391 | 87,634 | 19,011 | 25,873 |
| Limitaciones | 491x | MW566772 | 158,461 | 87,704 | 19,011 | 25,873 |
| Glorup | 356x | MW566770 | 158,461 | 87,704 | 19,011 | 25,873 |
| Łopuchówko | 212x | MW566774 | 158,461 | 87,704 | 19,011 | 25,873 |
| Hasbruch | 267x | MW566776 | 158,462 | 87,705 | 19,011 | 25,873 |
| Bieszczady NP | 211x | MW566773 | 158,426 | 87,669 | 19,011 | 25,873 |
| Eisenach | 105x | MW566778 | 158,456 | 87,699 | 19,011 | 25,873 |
| Morbach | 350x | MW566784 | 158,463 | 87,706 | 19,011 | 25,873 |
| Ehingen | 91x | MW566775 | 158,446 | 87,689 | 19,011 | 25,873 |
| Veneto | 625x | MW566783 | 158,463 | 87,706 | 19,011 | 25,873 |
| Cesky Krumlov | 300x | MW566777 | 158,462 | 87,705 | 19,011 | 25,873 |
| Brzeziny | 521x | MW566779 | 158,462 | 87,705 | 19,011 | 25,873 |
| Smolenice | 86x | MW566782 | 158,430 | 87,674 | 19,010 | 25,873 |
| Fantanele | 157x | MW566780 | 158,462 | 87,705 | 19,011 | 25,873 |
| Fläming | 306x | MW566781 | 158,464 | 87,705 | 19,013 | 25,873 |
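As a quick consistency check on Table 2, the total length of a quadripartite plastome should equal the LSC plus the SSC plus two copies of the inverted repeat. The sketch below applies that arithmetic to three rows copied from the table; the function name `plastome_size` and the choice of rows are illustrative assumptions, not part of the published analysis.

```python
def plastome_size(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# (name, reported total, LSC, SSC, IR) in bp, copied from Table 2
rows = [
    ("Bhaga", 158458, 87702, 19010, 25873),
    ("Colettes", 158391, 87634, 19011, 25873),
    ("Fläming", 158464, 87705, 19013, 25873),
]

# Collect any row whose reported total disagrees with the sum of its parts.
mismatches = [
    name
    for name, total, lsc, ssc, ir in rows
    if plastome_size(lsc, ssc, ir) != total
]
print(mismatches)
```

Because the IR length is identical in every individual (25,873 bp), all size variation in the table traces back to the single copy regions, mostly the LSC.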
Table 3. General characteristics of chloroplast microsatellite markers in 18 F. sylvatica individuals.

| | Mononucleotide | Dinucleotide | Pentanucleotide | Complex | Total |
|---|---|---|---|---|---|
| Monomorphic | 93 | 2 | 4 | 27 | 126 |
| Polymorphic | 4 | - | - | 8 | 12 |
| Total | 97 | 2 | 4 | 35 | 138 |
Table 4. Basic information on polymorphic chloroplast microsatellites. Marker ratio: number of individuals associated with a particular marker variant. Region types: LSC, large single copy; SSC, small single copy.

| No. | Starting Position (bp) * | Type | Region | Marker Ratio | Flanking Annotation |
|---|---|---|---|---|---|
| 1 | 4,363 | Complex | SSC | 17/1 | ndhA (exon II) ↔ ndhA (exon I) |
| 2 | 8,012 | Complex | SSC | 16/1/1 | psaC ↔ ndhD |
| 3 | 11,476 | Mononucleotide (A) | SSC | 17/1 | trnL ↔ rpl32 |
| 4 | 12,583 | Mononucleotide (T) | SSC | 17/0 ** | rpl32 ↔ ndhF |
| 5 | 46,142 | Complex | LSC | 16/1/1 | matK ↔ trnQ |
| 6 | 46,952 | Complex | LSC | 11/2/2/1/1/1 | matK ↔ trnQ |
| 7 | 50,589 | Mononucleotide (A) | LSC | 17/1 | trnG (exon I) ↔ trnG (exon II) |
| 8 | 55,923 | Complex | LSC | 16/2 | atpH ↔ atpI |
| 9 | 70,097 | Complex | LSC | 16/2 | rpoB ↔ trnC |
| 10 | 92,043 | Mononucleotide (A) | LSC | 16/2 | trnG (exon II) ↔ trnG (exon I) |
| 11 | 105,126 | Complex | LSC | 12/5/1 | ycf4 ↔ cemA |
| 12 | 107,580 | Complex | LSC | 17/1 | petA ↔ psbJ |

* according to the Bhaga reference; ** marker absent in one individual
Table 5. Summary of the variant sites detected in the 18 chloroplast genomes. Region types: LSC, large single copy; SSC, small single copy.

| No. | Position (bp) * | Marker Type | Region | Consensus | Alternative | Area | Marker Ratio | Flanking Annotation |
|---|---|---|---|---|---|---|---|---|
| 1 | 12,587 | SNP | SSC | T | C | noncoding | 17/1 | rpl32 ↔ ndhF |
| 2 | 46,985 | SNP | LSC | G | A | noncoding | 17/1 | tRNA-K ↔ tRNA-Q |
| 3 | 71,204 | SNP | LSC | G | T | noncoding | 9/9 | tRNA-C ↔ petN |
| 4 | 80,558 | Indel | LSC | T | - | noncoding | 17/1 | psbZ ↔ tRNA-G |
| 5 | 112,198 | SNP | LSC | A | C | noncoding | 17/1 | psaJ ↔ rpl3 |

* positions (bp) refer to the Bhaga genome
Table 6. Summary statistics of within-individual polymorphisms detected in regions of the 16 chloroplast genome assemblies. LSC: large single copy region; SSC: small single copy region; IR-A/IR-B: inverted repeat regions A and B.

| | LSC | SSC | IR-A | IR-B |
|---|---|---|---|---|
| Avg. variant depth | 349x | 360x | 477x | 477x |
| Avg. alternative var. depth | 18.7x | 16.1x | 18.4x | 18.5x |
| Number of unique positions | 5348 | 1161 | 1257 | 1262 |
| SNP | 76.8% | 80.9% | 83.7% | 84.1% |
| Indel | 10.2% | 8.8% | 9.6% | 9.4% |
| Complex | 8.2% | 6.2% | 3.1% | 3.1% |
| MNP | 0.2% | 0.3% | 0.8% | 0.7% |
| Mix | 4.6% | 3.9% | 2.7% | 2.7% |
| Coding | 48.6% | 67.3% | 62.9% | 63.1% |
| Non-coding | 51.4% | 32.7% | 37.1% | 36.9% |
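Table 6 reports both the average total depth and the average alternate-allele depth per region. Dividing one by the other gives a coarse picture of how rare the alternate alleles are within individuals. The sketch below does only that arithmetic; note that a ratio of averages is an approximation and is not the same as the mean per-site allele frequency, which would require the underlying variant calls. The dictionary and function names are illustrative assumptions.

```python
# Average depths per region, copied from Table 6 (fold coverage).
avg_depth = {"LSC": 349.0, "SSC": 360.0, "IR-A": 477.0, "IR-B": 477.0}
avg_alt_depth = {"LSC": 18.7, "SSC": 16.1, "IR-A": 18.4, "IR-B": 18.5}


def alt_fraction(alt: float, total: float) -> float:
    """Approximate share of reads supporting the alternate allele."""
    return alt / total


# Ratio of averages per region; a rough summary, not a per-site frequency.
fractions = {
    region: alt_fraction(avg_alt_depth[region], avg_depth[region])
    for region in avg_depth
}

for region, frac in fractions.items():
    print(f"{region}: {frac:.1%}")
```

Under this approximation the alternate alleles account for roughly 4 to 5% of reads in every region, far from the near-100% expected for a fixed plastid variant.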
Table 7. Summary of Mantel’s test statistics calculated within consecutive distance classes.

| Class | Boundary max (km) | Number of Pairs | Mantel r | p |
|---|---|---|---|---|
| 1 | 250 | 11 | 0.286 | 0.011 |
| 2 | 500 | 31 | 0.106 | 0.361 |
| 3 | 750 | 46 | 0.121 | 0.144 |
| 4 | 1000 | 27 | −0.016 | 0.760 |
| 5 | 1250 | 15 | −0.004 | 0.900 |
| 6 | 1500 | 11 | −0.023 | 0.374 |
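For readers unfamiliar with the statistic in Table 7, the following is a minimal sketch of a plain Mantel test: the Pearson correlation between the off-diagonal entries of two distance matrices, with significance assessed by permuting the rows and columns of one matrix. It is not the PASSaGE implementation cited in the article, and it omits the distance-class step used for Table 7. The two-sided p-value, the permutation count, and the toy matrices are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(d1, d2, permutations=999, seed=0):
    """Return (Mantel r, two-sided permutation p) for two square distance matrices."""
    n = len(d1)
    pairs = list(combinations(range(n), 2))  # upper-triangle index pairs
    x = [d1[i][j] for i, j in pairs]
    r_obs = pearson(x, [d2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # relabel individuals in the second matrix
        y = [d2[order[i]][order[j]] for i, j in pairs]
        if abs(pearson(x, y)) >= abs(r_obs) - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical toy data: four sites on a line, genetic distance proportional
# to geographic distance, so the correlation is perfect by construction.
positions = [0, 1, 2, 4]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[2 * abs(a - b) for b in positions] for a in positions]
r, p = mantel(geo, gen)
print(round(r, 3), p)
```

Permuting whole individuals, rather than shuffling matrix entries independently, preserves the dependence among the pairwise distances that share an individual, which is the reason the Mantel test is used instead of an ordinary correlation test.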
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Ulaszewski, B.; Meger, J.; Mishra, B.; Thines, M.; Burczyk, J. Complete Chloroplast Genomes of Fagus sylvatica L. Reveal Sequence Conservation in the Inverted Repeat and the Presence of Allelic Variation in NUPTs. Genes 2021, 12, 1357. https://doi.org/10.3390/genes12091357