Skim-Sequencing Reveals the Likely Origin of the Enigmatic Endangered Sunflower Helianthus schweinitzii

Resolving the origin of endangered taxa is an essential component of conservation. This information can guide efforts to bolster genetic diversity, and it also enables species recovery and future evolutionary studies. Here, we used low-coverage whole genome sequencing to clarify the origin of Helianthus schweinitzii, an endangered tetraploid sunflower that is endemic to the Piedmont Plateau in the eastern United States. We surveyed four accessions representing four populations of H. schweinitzii and 38 accessions of six purported parental species. Using de novo approaches, we assembled 87,004 bp of the chloroplast genome and 6770 bp of the nuclear 35S rDNA. Phylogenetic reconstructions based on the chloroplast genome revealed no reciprocal monophyly of taxa. In contrast, nuclear rDNA data strongly supported the currently accepted sections of the genus Helianthus. Combined cpDNA and rDNA information provided evidence that H. schweinitzii is likely an allotetraploid that formed through hybridization between the diploids Helianthus giganteus and Helianthus microcephalus.


Introduction
Since the implementation of screens for allozyme variation [1,2] and through recent developments in next generation sequencing [3], molecular data have provided conservation managers and evolutionary biologists with key information for conservation planning. This information has been used, for example, to designate evolutionarily significant units for conservation action by augmenting knowledge on the morphology and ecology of populations [4,5]. Molecular data have also enabled retrospective monitoring of effective population size and connectivity [6,7], and this has provided key information for translocation efforts, such as the identification of adaptive genetic variation or pathogens [3]. Lastly, genetic marker data have been instrumental in understanding and managing the destructive and constructive consequences of hybridization for declining populations. Destructive [...]
Helianthus giganteus, part of series Gigantei (H. giganteus, 2n = 2x = 34), is morphologically diverse, widespread, and hybridizes with many other species across its range. Helianthus microcephalus, part of series Microcephali (H. microcephalus, 2n = 2x = 34), is distinguishable by its prolific small flower heads and shows a wide ability to hybridize (Table 1). Helianthus atrorubens, of series Atrorubentes (H. atrorubens, 2n = 2x = 34), is tall, has thick crowded basal leaves, and red florets with yellow or purple style branches.
Perenniality, a trait occurring frequently in the genus Helianthus, may provide important clues regarding the parentage of H. schweinitzii. The perennial species within Helianthus show different modes of perennial habit, including the formation of rhizomes, tubers, a deep taproot, or even regrowth from crown buds [31]. This variation is present in the potential parents of H. schweinitzii, which itself has thick rhizomes and tuberous roots that likely evolved as a response to the periodic fires that once characterized its native habitat in the Carolina Piedmont [35]. Helianthus giganteus has short rhizomes and large, thick, woody roots that can appear tuber-like. Helianthus microcephalus has very fibrous roots, rhizomes, and crown buds, but does not form tubers. Helianthus angustifolius, H. floridanus, and H. simulans all have very fibrous roots and small slender rhizomes with many crown buds. Helianthus atrorubens has poorly developed or absent rhizomes and regenerates from crown buds [36]. Thus, based on rhizome and root morphology, H. schweinitzii bears the closest resemblance to H. giganteus.
Many of the species examined here can be successfully crossed (Table 1). This characteristic has previously been exploited to study the evolutionary relationships of these perennial species [31,36]. However, hybrids often show reduced fertility and tend not to persist in nature [38]. Cytogenetic observations of perennial species show population-dependent pairing during meiosis [39].
While the crossing ability of many of the perennial sunflowers has been tested, there has been limited work exploring the cross-fertility of H. schweinitzii, due to the rarity of the plant. Differences in chromosomal structure between H. schweinitzii and other sunflowers are also not well characterized [30,40]. With these challenges in mind, the objective of this study was to identify the parental species of H. schweinitzii, which may be useful in conservation efforts.

Plant Material, DNA Extraction, and Sequencing
The accessions of potential progenitor species used in this study were obtained from the United States Department of Agriculture (GRIN repository) and were chosen to maximize coverage of the geographical range for each species (Table 2).

Assembly of Organelle and Nuclear DNA Regions
A total of 26,734,453,800 bases of sequence data were generated, averaging 581,183,778 bases per sample, which corresponds to approximately 0.05X to 0.15X depth, depending on the genome size of the sample. Before assembling the chloroplast and mitochondrial genomes, we reduced the complexity of each library to enrich for organelle reads. This was achieved by aligning quality-filtered reads to the H. annuus chloroplast genome (GenBank accession NC_007977) and mitochondrial genome (GenBank accession KF815390) using BWA-MEM [42]. Reads that aligned to the chloroplast genome were assembled using VELVET [43] with a hash length of 21, a minimum contig length of 100 bp, and a coverage cut-off of 10 reads. The resulting contigs were then ordered based on alignments to the corresponding organellar genome of H. annuus and were merged using Geneious [44]. Regions not covered by Illumina reads, which produced low-quality assemblies, were removed, leaving only the high-quality regions for analysis. Because overlapping contigs were scarce, few final mitochondrial assemblies could be produced; we therefore excluded the mitochondrial genome from analyses and relied on the chloroplast genome for organelle DNA information.
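The depth range quoted above follows from dividing the sequenced bases by the genome size. A minimal sketch of that arithmetic is given below; the two genome sizes are hypothetical values chosen only to illustrate how the reported range could arise, and are not estimates taken from this study.

```python
# Approximate sequencing depth = sequenced bases / genome size.
MEAN_BASES_PER_SAMPLE = 581_183_778  # reported in the text


def sequencing_depth(bases: int, genome_size_bp: float) -> float:
    """Return the expected average depth for a given genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return bases / genome_size_bp


# Hypothetical genome sizes (assumptions for illustration only).
ASSUMED_GENOME_SIZES_BP = {
    "smaller genome (assumed 4 Gb)": 4.0e9,
    "larger genome (assumed 11 Gb)": 11.0e9,
}

for label, size in ASSUMED_GENOME_SIZES_BP.items():
    depth = sequencing_depth(MEAN_BASES_PER_SAMPLE, size)
    print(f"{label}: {depth:.3f}X")
```

Under these assumed sizes the same number of bases yields roughly a threefold difference in depth, which is why depth is reported as a range rather than a single value.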
The nuclear 35S rDNA regions were assembled using a similar procedure. We relied on quality-filtered reads and used the VELVET de novo assembler [43] with the same parameters as for the chloroplast DNA. Contigs for 35S rDNA were identified based on alignments to the corresponding H. annuus 35S reference (GenBank accession KF767534). A limitation of this approach is that only the most common variant at each polymorphic site within an individual is retained. Additionally, 35S rDNA and chloroplast assemblies for six H. giganteus samples were obtained from Bock et al. (2014) [21]. These were generated using the same assembly pipeline described above.
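The limitation noted above can be illustrated with a toy majority-rule consensus. The sketch below is not the internal logic of VELVET or Geneious; the function name and the pileup string are hypothetical and serve only to show how a minority rDNA variant disappears once a single base is reported per site.

```python
from collections import Counter


def majority_consensus(column_bases: str) -> str:
    """Return the most frequent base among reads covering one site.

    Ties are resolved in favour of the base seen first in the column.
    """
    if not column_bases:
        raise ValueError("empty pileup column")
    return Counter(column_bases).most_common(1)[0][0]


# Hypothetical pileup at one 35S rDNA site where two repeat variants coexist.
pileup = "AAAAAAAGGG"
consensus = majority_consensus(pileup)
minor_fraction = 1 - pileup.count(consensus) / len(pileup)

print(consensus)                 # base kept in the assembly
print(round(minor_fraction, 2))  # share of reads whose variant is discarded
```

In an allopolyploid, where rDNA repeats from both parents may persist, the discarded minority variant could be the one contributed by the second parent.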

Phylogeny Reconstruction
The chloroplast and rDNA sequences were aligned using MAFFT [45] with default settings, and the alignments were inspected and edited in Geneious [44] to filter out low-quality sequence. Maximum likelihood (ML) trees were generated using PhyML [46] as implemented in Geneious [44], with branch support estimated using the Shimodaira-Hasegawa-like (SH-like) procedure. Bayesian inference was conducted with MrBayes [47] under the general time-reversible (GTR) model. The Bayesian analysis used four runs, each with four Markov chains initiated from a random tree and run for 1,000,000 generations, resulting in an effective sample size of 336 for cpDNA and 936 for rDNA. The first 25% of all sampled trees were discarded as burn-in. Trees were rooted with H. annuus, which served as both the outgroup and the source of the reference genomes. For the rDNA data, to further investigate the possibility that H. schweinitzii originated via repeated polyploidization events, we surveyed levels of sequence divergence among haplotypes obtained for each species. These analyses were based on the Tamura-Nei distance [48]. Series-specific nucleotide variation was further explored in the assembled portion of the 35S rDNA data. Contigs were aligned to the H. annuus rDNA reference and SNPs were called using Geneious. Heterozygous SNPs (identified via overlapping contigs) and tri-allelic SNPs were removed. In total, 260 SNPs were called in the 35S rDNA between 200 bp and 6770 bp, the region for which all individuals had a full assembly.
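For readers unfamiliar with the Tamura-Nei distance used for the haplotype divergence survey, the following sketch computes it for a pair of aligned sequences. It is an illustration rather than the code used in this study: base frequencies are estimated from the two sequences themselves, sites with gaps or ambiguity codes are skipped, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}


def tamura_nei_distance(seq1: str, seq2: str) -> float:
    """Tamura-Nei (1993) distance between two aligned DNA sequences."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)

    # Base frequencies pooled over both sequences (an assumption of this sketch).
    pooled = [base for pair in pairs for base in pair]
    g = {b: pooled.count(b) / len(pooled) for b in "ACGT"}
    if min(g.values()) == 0:
        raise ValueError("sketch requires all four nucleotides to be present")
    g_r = g["A"] + g["G"]  # purine frequency
    g_y = g["C"] + g["T"]  # pyrimidine frequency

    # Proportions of the three classes of differences.
    p1 = sum(1 for a, b in pairs if {a, b} == {"A", "G"}) / n  # purine transitions
    p2 = sum(1 for a, b in pairs if {a, b} == {"C", "T"}) / n  # pyrimidine transitions
    q = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES)) / n  # transversions

    k1 = 2 * g["A"] * g["G"] / g_r
    k2 = 2 * g["C"] * g["T"] / g_y
    k3 = 2 * (g_r * g_y - g["A"] * g["G"] * g_y / g_r - g["C"] * g["T"] * g_r / g_y)

    w1 = 1 - p1 / k1 - q / (2 * g_r)
    w2 = 1 - p2 / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    # math.log raises ValueError if a term is <= 0 (saturated divergence).
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)


# Toy sequences differing by a single A/G transition (not data from this study).
ref = "ACGTACGTACGTACGTACGT"
alt = "GCGTACGTACGTACGTACGT"
print(tamura_nei_distance(ref, alt))
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same site and for unequal base frequencies.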

Results
Phylogenetic analyses based on a chloroplast DNA alignment of 87,004 bp did not recover any perennial sunflower species as reciprocally monophyletic. Instead, many groupings tracked geography. These included, for example, accession pairs PI468716 (H. [...] Table 2). Although species were not monophyletic, there were some associations among H. angustifolius, H. simulans, and H. floridanus of the Angustifolii series. Helianthus schweinitzii accessions repeatedly grouped with H. microcephalus and H. giganteus across analytical methods (Figure 2). Mitochondrial phylogenies were not informative due to limited coverage across taxa and poor alignments.
Analyses of rDNA sequence divergence revealed comparable levels of diversity for H. schweinitzii and candidate progenitor species (Figure 3). The level of sequence divergence between haplotypes was comparable among the diverse accessions of putative homoploid hybrids and was higher than in putative diploid parents, in agreement with previous expectations. Phylogenies based on the 35S rDNA alignment (6770 bp) revealed that many taxa form monophyletic groups, some with high support (Figure 4). Two major species groups were recovered. The first comprised H. angustifolius, H. floridanus, and H. simulans, with these species being polyphyletic. This is consistent with the idea previously advanced by Timme et al., 2007 [49], that H. simulans may be a homoploid hybrid of H. angustifolius and H. floridanus. The second group comprised H. giganteus, H. atrorubens, H. microcephalus, and H. schweinitzii, each recovered as monophyletic. There was no strong phylogenetic support for the association of H. microcephalus with H. schweinitzii, despite the reported morphology-based characterization of H. microcephalus as a parental species [31].
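The notion of (reciprocal) monophyly used in these results can be made concrete with a small check on a rooted tree. The sketch below represents a tree as nested tuples with string tip labels; the toy topology and labels are hypothetical and do not correspond to Figure 2 or Figure 4.

```python
def leaves(tree) -> set:
    """Collect the tip labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, taxa) -> bool:
    """True if some node of the rooted tree has exactly `taxa` as its tips."""
    taxa = set(taxa)
    if leaves(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)


def reciprocally_monophyletic(tree, groups) -> bool:
    """True if every group of tips forms its own exclusive clade."""
    return all(is_monophyletic(tree, tips) for tips in groups.values())


# Hypothetical rooted tree: species A forms a clade, species B does not
# because one accession of species C nests inside it.
toy_tree = ("outgroup", (("a1", "a2"), (("b1", "c1"), "b2")))
toy_groups = {"A": {"a1", "a2"}, "B": {"b1", "b2"}, "C": {"c1"}}

print(is_monophyletic(toy_tree, toy_groups["A"]))
print(is_monophyletic(toy_tree, toy_groups["B"]))
print(reciprocally_monophyletic(toy_tree, toy_groups))
```

A single nested accession, as with `c1` here, is enough to break monophyly for a species, which is the pattern a chloroplast capture event would be expected to produce.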
The cpDNA tree did not recover monophyletic groups, but H. schweinitzii was consistently associated with H. giganteus and H. microcephalus. The rDNA tree did not identify the same associations as the cpDNA tree. This discordance may reflect the different modes of inheritance of cpDNA (maternal) and rDNA (biparental), which can produce differing tree topologies; hybridization (chloroplast capture), insufficient sampling, and variable evolutionary rates may also contribute.

Discussion
Chloroplast DNA variation can be used to explore species origin and, in the case of hybrid taxa, the direction of hybridization (i.e., the identity of the maternal progenitor). Also, the extent of polymorphism retained at the level of organelle DNA may be used to distinguish between the occurrence of single vs. multiple polyploid speciation events [50,51]. In this study, the chloroplast phylogeny did not recover any perennial Helianthus species as reciprocally monophyletic. This is in line with previous findings in other perennial Helianthus [21] as well as in annual Helianthus taxa [52]. These results can be explained by incomplete lineage sorting (ILS) or by reticulation. ILS, which is caused by retention of ancestral states, results in discordant phylogenetic relationships [53,54] and is likely common in sunflowers due to their recent radiation across North America [25,49,55,56]. In perennial taxa in particular, allelic coalescence may be delayed because these species are fewer generations removed from the speciation event, all else being equal.
The alternative explanation, reticulation, results in systematic associations between species. These associations reflect historical organelle capture events occurring among pairs of taxa that are interfertile [56]. Previous results in annual Helianthus [52] have indicated that, relative to ILS, reticulation is likely more important in generating patterns of cytonuclear discordance such as those observed here. Indeed, we identified several cases of haplotype sharing among geographically proximate populations (Figure 2), which would indicate that hybridization is more likely than ILS.
In the case of polyploid species, instances of limited or no chloroplast DNA variation have previously been interpreted as evidence for the occurrence of a single speciation event [50,51]. Cases where chloroplast DNA variation is extensive or comparable to that observed in candidate progenitors can be explained by two non-mutually exclusive scenarios: repeated polyploid speciation [57] or post-speciation reticulation. In the case of H. schweinitzii, the level of sequence divergence that we inferred among cpDNA haplotypes was similar for all perennial sunflowers. Therefore, because of the likely occurrence of reticulation and chloroplast capture in this system, our ability to infer the number of speciation events for H. schweinitzii is limited. The phylogenetic placement of H. angustifolius, H. floridanus, and H. simulans based on the rDNA data is in agreement with previous taxonomic work. This placement is supported by the high level of cross-fertility among these three species [31,36].
The interpretation of the cpDNA information is complicated by ongoing ILS and introgression. However, H. schweinitzii shares more cpDNA haplotypes with H. giganteus than with any of its other possible parents (Figure 2). The rDNA shows a trichotomy in the Bayesian inference, which includes H. microcephalus and H. atrorubens, while the maximum likelihood tree suggests a closer relationship with H. atrorubens, H. angustifolius, H. simulans, and H. floridanus, making it difficult to draw definitive conclusions. Thus, the most parsimonious explanation when considering cpDNA, rDNA, crossing data, and geography is an allotetraploid origin from H. microcephalus and H. giganteus, as originally hypothesized by Heiser [31]. However, we are unable to fully exclude the possibility of an autopolyploid origin, or that an extinct diploid is the progenitor species (or one of the progenitors), similar to the B genome in Triticum [58]. If H. schweinitzii is of hybrid origin, it is possible that H. giganteus served as the maternal parent, while the paternal parent could be an extinct species, perhaps the common ancestor of H. atrorubens, H. angustifolius, H. simulans, and H. floridanus. Based on the crossing studies, it is also possible that bidirectional hybridization events led to the origin of H. schweinitzii. The hypothesis of an extinct progenitor is also supported by the distinctive sesquiterpene lactone chemistry reported for H. schweinitzii [59] and by the finding of Timme et al. [25] that polyploids formed their own clade in a 35S rDNA tree for Helianthus. However, it is important to keep in mind that novel secondary compounds are often generated in hybrids [60] and that concerted evolution among parental rDNA repeats (or the presence of both parental sites) in allopolyploids could create a convergent phylogenetic signal. Testing the hypothesis of an extinct progenitor will require additional genomic data. Another option would be to attempt to re-create H. schweinitzii from hybrids of H. microcephalus and H. giganteus, a previously reported successful cross [31].

Conclusions
The demonstration that Helianthus schweinitzii exhibits significant genetic distinctness from its progenitors heightens the need to conserve this distinctive but threatened species. In addition, the presence of well-formed tubers makes it of interest as a potential study system for tuber formation, as well as a possible source of genetic material for the improvement of H. tuberosus. The two tuber-forming species of the genus are now proposed to have different sets of ancestors (H. grosseserratus and H. hirsutus for H. tuberosus [21]; H. giganteus and H. microcephalus for H. schweinitzii, current study), increasing the likelihood that different sets of genes may be involved in tuber formation and chemistry in the two species.