Article

The Conservation Genetics of Iris lacustris (Dwarf Lake Iris), a Great Lakes Endemic

by James Isaac Cohen 1,* and Salomon Turgman-Cohen 2

1 Department of Botany and Plant Ecology, Weber State University, 1415 Edvalson St., Dept. 2504, Ogden, UT 84408-2504, USA
2 E.S. Witchger School of Engineering, Marian University, 3200 Cold Spring Road, Indianapolis, IN 46222-1997, USA
* Author to whom correspondence should be addressed.
Plants 2023, 12(13), 2557; https://doi.org/10.3390/plants12132557
Submission received: 5 May 2023 / Revised: 26 May 2023 / Accepted: 3 July 2023 / Published: 5 July 2023
(This article belongs to the Special Issue Advances in Plant Reproductive Ecology and Conservation Biology)

Abstract

Iris lacustris, a northern Great Lakes endemic, is a rare species known from 165 occurrences across Lakes Michigan and Huron in the United States and Canada. Due to multiple factors, including habitat loss, lack of seed dispersal, patterns of reproduction, and forest succession, the species is threatened. Early population genetic studies using isozymes and allozymes recovered no to limited genetic variation within the species. To better explore genetic variation across the geographic range of I. lacustris and to identify units for conservation, we used tunable Genotyping-by-Sequencing (tGBS) with 171 individuals across 24 populations from Michigan and Wisconsin, and because the species is polyploid, we filtered the single nucleotide polymorphism (SNP) matrices using polyRAD to recognize diploid and tetraploid loci. Based on multiple population genetic approaches, we resolved three to four population clusters that are geographically structured across the range of the species. The species migrated from west to east across its geographic range, and minimal genetic exchange has occurred among populations. Four units for conservation are recognized, but nine adaptive units were identified, providing evidence for local adaptation across the geographic range of the species. Population genetic analyses with all, diploid, and tetraploid loci recovered similar results, which suggests that methods may be robust to variation in ploidy level.

1. Introduction

In 1818, Thomas Nuttall described a new species of crested Iris L., Iris lacustris Nutt., “on the gravelly shores of calcareous islands of Lake Huron” [1]. Since then, the recognized geographic range of the species has expanded to include the northern regions of Lakes Huron and Michigan in the United States and Canada. Presently, the species is known from 165 occurrences, with more than half in Michigan (89) and the others split between Wisconsin (36) and Ontario (40) [2].
Plants of I. lacustris grow less than 15 cm in height [3], and this feature provides the species with its common name, Dwarf Lake Iris. The species bears self-compatible flowers, with purple sepals and purple petals with yellow and white markings, that are visited by various species of bees [4]. Across its geographic range, I. lacustris frequently inhabits the understory of coniferous forests along the shore, although a small number of inland populations are known (Figure 1) [2,5,6]. These habitats have thin entisols, and the dominant tree species primarily include Thuja occidentalis L., Abies balsamea (L.) Miller, and Picea glauca (Moench) Voss. The species has become a well-known endemic plant of the Great Lakes and is so characteristic of the region that it was recognized as the state wildflower of Michigan [7].
In 1988, 170 years after I. lacustris was initially described, the species was listed as federally threatened [5]. The small number of populations and individuals is due to multiple factors, including the loss of shoreline habitat, fungal infection of fruits, lack of seed dispersal, and overgrowth of the forest canopy that restricted plant growth, flower production, and sexual reproduction. Plants of the species currently reproduce more by vegetative growth than germination from the myrmecochorous seeds [5]. Despite this low germination rate, seeds can remain viable in the seedbank for at least 15 years [5], a factor that could influence long-term population growth and genetic diversity, although mass germination and recruitment are rare [4].
The ecology of I. lacustris has been examined to a greater extent than the genetic diversity of the species. To date, only three studies have explored this topic: Simonich and Morgan [8] examined nine populations in Wisconsin using 22 allozyme markers, Orick [9] investigated nine populations in Michigan using 24 isozymes, and Hannan and Orick [10] examined nine populations in Michigan using 18 isozymes. In two studies, researchers identified genetic homogeneity across the populations; however, Orick [9] found overall heterozygosity to be 3.7%. Hannan and Orick [10] also note that gene silencing may have occurred in four loci. In contrast to the limited genetic diversity recognized in I. lacustris, Hannan and Orick [10] found that the sister species, I. cristata Aiton [11], which has a wider geographic range across eastern North America, was variable at 11 of 15 loci. These studies suggest that the genetic diversity of this rare species of Iris is quite limited. This genetic paucity is intriguing because I. lacustris and its sister are both putative tetraploids [10], and polyploid plant species tend to have greater genetic variation than diploid relatives, although selfing tends to be higher in polyploids [12,13,14]. Importantly, the genetic diversity of I. lacustris may have implications for the ability of the species to respond to the changing environment across its geographic range and for various conservation efforts.
In order to investigate the population and conservation genetics of the species in a comprehensive manner, we examined multiple populations from across Michigan and Wisconsin, and we used tunable Genotyping-by-Sequencing (tGBS [15]), a method of reduced representation sequencing, to identify single nucleotide polymorphisms (SNPs) among the populations. The objectives of the present study are threefold: (1) identify genetic diversity and population structure and substructure across the range of I. lacustris, (2) explore patterns of migration, and (3) recognize population clusters for management of this rare species. Given the paucity of genetic diversity identified in previous studies, we hypothesized that there would be limited genetic variation across the species.

2. Results

2.1. DNA Sequencing and Polyploid Filtering

Among the 171 individuals of 24 populations across the geographic range of I. lacustris in Michigan and Wisconsin (Figure 1, Table 1), 726,786,603 paired-end reads were sequenced, with a mean of 4,225,503 reads per sample. The consensus sequence included 1,335,996 scaffolds with 196,139,854 bp (N50 = 644,994, L50 = 145). The mean per sample alignment and unique alignment to the consensus sequences are 93.9% and 74.4%, respectively. For the MCR90 dataset, 125 reads were interrogated per SNP across 2,341,730 bases, with 4.8% missing data for the final dataset. For the MCR50 dataset, 31 reads were interrogated per SNP across 23,904,409 bases, with 31.4% missing data for the final dataset. The numbers of SNPs in the diploid and tetraploid datasets identified through analysis in polyRAD are in Table 2.

2.2. Population Genomics

Across all datasets, observed heterozygosity slightly exceeds expected heterozygosity, and FIS values are, in general, negative (Table 3). Pairwise FST values range from 0.1 to 0.45, and results are similar among datasets (Table 4). Based on the AMOVA results, most of the variation is within samples, followed by variation between populations, regardless of the dataset and the partitioning of the populations (Supplemental Table S1). Mantel tests for isolation-by-distance identify spatial structure in all datasets (Supplemental Figure S1), with p < 0.001 for analyses of individuals, but only the MCR90 datasets show spatial structure for populations (p < 0.05).
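The link between excess observed heterozygosity and a negative FIS can be illustrated with a minimal Python sketch. It assumes a single biallelic locus with diploid genotype calls and uses the uncorrected estimator FIS = 1 - Ho/He; the function name and the toy genotypes are hypothetical, and this is not the sample-size-corrected estimator implemented in hierfstat.

```python
def heterozygosity_stats(genotypes):
    """Return (Ho, He, FIS) for one biallelic locus.

    genotypes: diploid calls coded as the number of copies of the
    alternate allele (0, 1, or 2); None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes at this locus")
    # Observed heterozygosity: fraction of heterozygous individuals.
    ho = sum(1 for g in called if g == 1) / n
    # Expected heterozygosity under Hardy-Weinberg: 2pq.
    p = sum(called) / (2 * n)
    he = 2 * p * (1 - p)
    if he == 0:
        return ho, he, float("nan")  # monomorphic locus, FIS undefined
    return ho, he, 1 - ho / he


# Toy locus (illustrative values only): four heterozygotes, two homozygotes.
ho, he, fis = heterozygosity_stats([1, 1, 1, 1, 0, 2])
print(ho, he, fis)
```

In this toy case Ho (about 0.67) exceeds He (0.5), so FIS is negative, the same direction as the pattern reported in Table 3.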
Results from analyses in fastSTRUCTURE, STRUCTURE, MavericK, and tess3r are similar. Based on the results from StructureSelector, the optimal K values were greater for all loci analyzed together than for either the diploid or tetraploid loci analyzed independently (Table 2, Supplemental Table S2). Similar clusters were recovered with the different datasets (Figure 2, Table 1), with a clear division among three groups (eastern, western, and central populations), and multiple analyses resulted in the central group being divided into two distinct groups at K = 4 and/or 5 (Figure 1, Supplemental Figures S2–S4), especially for all loci in fastSTRUCTURE and for multiple datasets with STRUCTURE, MavericK, and tess3r. At K = 4–5, the two Wisconsin populations were often recovered with unique genomic signatures suggestive of admixture, and this is particularly the case with the MCR90 datasets. While the results of conStruct are similar to the others, the three groups identified are less clearly delimited, with boundaries between the eastern and western populations overlapping to a larger extent than in the other analyses (Supplemental Figure S5), although similar patterns can be recognized at K = 4 and 5 for the MCR90 all and diploid loci datasets. Among all methods, the three populations on Bois Blanc Island in Michigan (MI5, MI13, and MI22), in the northwestern geographic range of the species, also include some individuals that show signals of admixture between the eastern and central populations (Figure 2). Graphs of K values for all analyses are included in Supplementary Materials (Figures S6–S16).
The results of principal components analysis (PCA) and discriminant analyses of principal components (DAPC) are similar to those that explicitly consider a priori population structure. With PCA, three to four clusters were recovered corresponding to the same ones from the population assignment analyses, and this was more evident with the MCR90 datasets than with the MCR50 ones. In all analyses, three populations (MI6, MI16, and WI5) were recognized as most distinct from the other populations. Across DAPC analyses, individuals from the same population tended to cluster together, and this is similar to results from other methods. In general, DAPC analyses recover MI6, MI16, and WI5 as distinct units or as a cluster together, with the results for MCR50 all loci being the only exception. In analyses with this dataset, WI5 was included in a cluster distinct from the other two populations, but with WI4 and populations from Michigan. In some analyses, such as MCR50 and MCR90 diploid loci, the divided cluster of central populations was identified. The number of loci under selection in each dataset is in Table 2.
Patterns of migration inferred from BA3-SNPs suggest that migration is minimal, regardless of the dataset analyzed, and that most individuals are from their original population (Figure 3). While this was certainly the case for all loci for MCR90, analyses with only the diploid loci for three or four population clusters (Table 1) provide evidence of greater rates of migration between adjacent populations (Figure 3). Migration directly between the eastern and western populations was negligible. The relationship among the four population clusters that was most supported by the results of DIYABC-RF and abcranger varies depending on the dataset analyzed. For all, diploid, and tetraploid loci, (West (Mid1 (Mid2, East))), (West (East (Mid1, Mid2))), and (West (Mid1 (Mid2, East))) are recovered as optimal, respectively, and (Mid2 (West (Mid2, East))) and (West (East (Mid1, Mid2))) are identified as close second choices for all and tetraploid datasets, respectively. The one constant among the three optimal trees is that the western population is recognized to have diverged prior to the mid and eastern populations, and this also is the case for one of the near-optimal trees (Supplemental Figure S17).

2.3. Conservation Units

Based on the method of Funk et al. [16], evolutionarily significant units (ESUs) were identified using all loci, as described below, and the management units (MUs), which are based on fastSTRUCTURE, PCA, and DAPC analyses with loci not under selection, are quite similar. The largest difference between ESUs and MUs is that the two populations in Wisconsin may or may not be included with the other two western populations, MI6 and MI16, depending on the use of all loci or only diploid or tetraploid loci (Figure 4). The populations on Bois Blanc Island also have mixed ancestry based on these loci. The adaptive units, which are based on fastSTRUCTURE, PCA, and DAPC analyses with loci under selection, provide quite different results. Generally, among analyses, nine adaptive units are recognized, and these are structured based on geography (Figure 4, Supplemental Figure S18, Table 1).
The results of the fastSTRUCTURE, PCA, and DAPC are similar, with one exception. Unlike analyses with fastSTRUCTURE and PCA, where individuals of the same population cluster together, with DAPC, some individuals of the same population are members of different clusters. This is likely due to the large number of clusters identified as optimal, which is particularly the case for MCR90 and MCR50 datasets with all loci.

3. Discussion

3.1. Population Structure and Genetic Diversity

Based on the multiple datasets explored using various methodological approaches, three or four different population clusters were frequently recognized for I. lacustris across Michigan and Wisconsin. These clusters are structured geographically, with eastern, central, and western groups, and at higher K values, the central group is subdivided into two groups that are also geographically oriented (Figure 1 and Figure 2). In the three prior studies that employed isozymes and allozymes to examine the population genetics of I. lacustris [8,9,10], little to no genetic diversity was identified in the populations. Each study only investigated the genetic diversity of populations within one state, using markers available at the time, which likely contributed to the paucity of genetic diversity detected. In the present study, many more loci were examined, and individuals from across most of the geographic range of the species were analyzed together, which provides a more holistic approach to elucidating the genetic diversity of the species. These results demonstrate that our hypothesis, a lack of genetic diversity within the species, was incorrect.
Across all studied populations, statistically significant isolation-by-distance is noted, and much of the genetic variation occurs within samples and among populations, with little variation within each population. These results are, on some level, unsurprising for a species that is not only clonal but also includes minimal sexual reproduction. Sampling issues, such as small numbers of individuals studied for some populations and potential collection of ramets, could also have contributed to limited within-population genetic diversity. Additionally, almost all populations have negative FIS values, a finding frequently occurring with clonal plants [17]. A similar result was recovered by Edgeloe et al. [18] for another clonal, polyploid species, Posidonia australis Hook.f. Despite the clonal growth in these polyploid species, the multiple gene copies may provide sufficient genetic diversity and potential so that rare species, such as I. lacustris, do not suffer the negative long-term impacts of vegetative reproduction and inbreeding. The changing climate will certainly be a test as to whether the genetic diversity harbored in each population will be appropriate to adapt to new conditions [19].
Among the identified clusters of populations, there are two notable areas: Bois Blanc Island in the eastern part of the sampled range and the four western populations. In Bois Blanc Island, the populations display mixed ancestry between the eastern and central populations, and these were results recovered with multiple datasets and analyses. This mixed ancestry could occur because of hybridization on the island itself with ancestors from both populations colonizing and interbreeding there. Alternatively, hybridization could have taken place on the mainland of the lower peninsula of Michigan, such as at MI7 or MI8, followed by colonization of the island. While the signature of mixed ancestry identified in the present study may suggest that hybridization is recent, given that the species reproduces clonally, the signature of (older) hybridization could remain for an extended period of time. It is useful to keep in mind that the island and nearby areas on the mainland are some of the more heavily sampled geographic regions in the present study. This greater sampling could hint at a similar pattern in other areas if individuals were sampled to a larger extent. It was not possible to include representatives from Ontario, Canada in the study, and future studies that add these will likely have greater context for the relationship of the central and eastern populations to those even farther east.
The four populations in the western cluster (MI6, MI16, WI4, and WI5) are notable. While these populations form a cluster in most analyses (Figure 2), the two Wisconsin populations (WI4 and 5) differ from those in Michigan, and, in some analyses, from each other. While WI4 and WI5 are geographically close together on the Door Peninsula and tend to cluster together in some analyses, WI4 is sometimes resolved as sharing ancestry with the eastern populations, which is not the case for WI5. This could be due to the retention of ancestral polymorphism or the fact that the establishment of each of these populations differs. However, in analyses that account for both genetic and geographic data (i.e., tess3r and conStruct), both Wisconsin populations are distinct clusters and/or are usually allied with the other western populations. This is particularly the case for the diploid dataset. In another, well-known Great Lakes shoreline endemic, Cirsium pitcheri Torr. & A.Gray, a similar pattern was recovered. The populations from the Door Peninsula are also quite distinct from others on Lake Michigan [20], and the northern populations on the peninsula share more alleles with the populations in the Upper Peninsula of Michigan than with some of the populations on the southern part of the peninsula.
MI6 and MI16 are intriguing populations of I. lacustris because they are situated inland, and this is not the case for the other sampled populations. While other populations can be found a short distance from the shoreline, these populations are ca. 30 km from the current boundary of Lake Michigan. These two populations are consistently recognized as genetically distinct from the other sampled populations, and these both likely became established during higher water level periods of Glacial Lake Algonquin ca. 12,500 years ago [21,22]. As water levels decreased during the time of Glacial Lake Chippewa and subsequently rose to current levels, these two populations became isolated in suitable habitat (e.g., conifer wetland) that allowed individuals of I. lacustris to persist, but without the opportunity to interbreed with other, coastal populations, resulting in their distinct genetic signature (Figure 2).

3.2. Migration and Demography

After deglaciation, I. lacustris migrated eastward from the western part of its range. This pattern provides evidence that MI6 and MI16 became established early in the colonization of the species during times of higher water levels and, therefore, are relicts rather than the result of inland dispersal. Additionally, the central and then eastern populations developed via migration across northern Lakes Michigan and Huron, and these populations may have retained some of the ancestral polymorphisms in the more western populations, such as WI4 and WI5. This west-to-east pattern suggests that the populations in Ontario are the most recently established, a hypothesis that can be tested during a future study. The pattern noted here for I. lacustris differs from that of C. pitcheri, which is hypothesized to have migrated from east to west [20].
Overall, rates of migration, as inferred with BA3-SNPs, among populations are minimal, a result recovered in other species of Iris on the Korean Peninsula [23] and a pattern that is not uncommon for narrow endemics [20]. This minimal migration is the case for all 24 populations studied as well as with three and four population clusters inferred (Figure 3). Although the species presently reproduces within populations, migration occurred and may have provided an infusion of new alleles, even if this was not a common occurrence.
In C. pitcheri, Fant et al. [20] note that the changes in the water level of the Great Lakes shaped the geographic distribution of this endemic species, with lower water levels allowing for increased connection among populations. Lake level changes could also have impacted the geographic distribution of I. lacustris. This is particularly the case for the more inland populations, which could have become established ca. 4500 years ago during the most recent high water levels for the lake. Lower lake levels may have influenced colonization of the islands as well as migration across the northern regions of Lake Michigan and allowed for the exchange of individuals that currently would be more challenging.
An alternative hypothesis for the present geographic distribution of the species also exists. Van Kley and Wujek [6] and Brotske [4] provide evidence that I. lacustris can inhabit a diversity of ecosystems and that changes in patterns of disturbance and forest succession following European colonization of the area reduced the suitable habitat for the species (e.g., more forests with more closed canopies). This has resulted in populations primarily being restricted to shorelines where habitat was appropriate. If this is the case, the inland populations, such as MI6, would still represent relicts of a prior time, but this would be due to remnant habitat availability based on adequate disturbance regimes and/or seral stages, not prior establishment during higher water levels of the Great Lakes and subsequent serendipitous survival.

3.3. Subsetting Diploid and Tetraploid Loci

In the present study, polyRAD [24] was used to create datasets of diploid and tetraploid loci, and these were analyzed alongside a dataset of all loci for the MCR90 and MCR50 datasets. In general, analyses of all six datasets produced fairly similar results (Figure 2, Table 3 and Table 4). fastSTRUCTURE analyses of MCR90 and MCR50 datasets of all loci resulted in the identification of a cluster of six populations in the central part of the sampled population of I. lacustris (MI2, MI3, MI4, MI11, MI12, and MI20) that was not recovered with the diploid or tetraploid datasets, although hints of this cluster can be seen in the MCR90 2N dataset at K = 5. This cluster is identified in all of the datasets with loci under selection as either one or two clusters (Figure 2) and with the MCR90 datasets analyzed with STRUCTURE [25] and MavericK [26].
The similar results among the datasets, regardless of ploidy, provide some evidence that analyzing all loci together, without separating diploid from tetraploid loci, may not lead to spurious results when SNP data are used for population genomics [27]. This statement should be treated with skepticism because it is based on only one empirical study. Others who have used polyRAD to subset their datasets and identify diploid loci for population genomics [28,29], a practice aligned with the assumptions of common methods [28], have not compared the use of all loci and/or tetraploid loci with the use of only loci that segregate as diploids. It would be useful for additional studies on the population genomics of polyploid species to examine data employing all, diploid, and tetraploid (and higher) loci to determine if similar or divergent results are recovered. At the same time, the results presented herein may provide some level of confidence to researchers investigating the population genomics of species of unknown ploidy that the use of all loci identified via tGBS and similar reduced-representation methods may not yield incongruent results.

3.4. Conservation Genetics of I. lacustris

The evolutionarily significant units (ESUs) were described above with all loci used for population genomic analyses, and the management units (MUs), which were determined using only loci not under selection, are similar, but not identical to the ESUs; however, the differences are minor (Figure 4). Given the similar ESUs and MUs, the management of the populations of I. lacustris could be geographically clustered into three to four units. However, the results of the use of the loci under selection to resolve adaptive units (AUs) differ from those of ESUs and MUs (Supplemental Figure S18). The AUs provide evidence of local adaptation, so managing only three or four MUs would not necessarily ensure that all of the genetic diversity of the species is appropriately protected. A total of nine AUs are recognized (Table 1), and while these are also geographically clustered, the AUs are much smaller than are the ESUs and MUs (Figure 4).
This local adaptation is, on some level, unsurprising, because even though the species is generally restricted to the same type of habitat presently (i.e., shorelines), climatic, soil, and vegetation differences occur across the geographic range of the species. Indeed, I. lacustris inhabits three of the landscape ecology regions of Michigan and multiple districts and subdistricts within each region [30,31]. Van Kley and Wujek [6] also recognized four soil types, four vegetation types, and pH variation across the species’ range. Given that the species primarily reproduces asexually, this can lead to a loss of genetic variation over time as a limited number of successful genotypes dominates each particular climate–soil–vegetation combination. Consequently, the seemingly same type of habitat in a geographically distinct area may result in local adaptations to the specific region and ecosystem and contribute to outbreeding depression, limiting successful offspring from infrequent interpopulation crosses.

4. Materials and Methods

4.1. Plant Material

During the summers of 2019 and 2020, leaf material of 171 individuals of I. lacustris was collected from 24 locations in Michigan and Wisconsin (Figure 1) and dried in silica gel. The number of individuals per population ranged from 1 to 12, depending on the suitability of the population for collection. Most individual plants were collected at least 3 m from each other to maximize the possibility of sampling genets, not ramets. Latitude and longitude were recorded for each specimen.

4.2. DNA Sequencing

Leaf material was sent to data2bio (www.data2bio.com, accessed on 1 May 2023) for DNA isolation and tunable Genotyping-by-Sequencing (tGBS) to recognize single nucleotide polymorphisms (SNPs) across the populations. Using the restriction enzyme Bsp1286I, paired-end tGBS libraries were created [15] and subsequently sequenced with an Illumina HiSeq X (Illumina Inc., San Diego, CA, USA). Based on all sequence data, consensus reference sequences were generated with CD-HIT-454 [32] after sequencing depth was normalized to 50×, and sequencing errors were corrected using Fiona [33]. Low-quality reads were discarded (PHRED quality < 15 and error rates ≥ 3%) and trimmed, and GSNAP [34] was employed to map reads to the reference sequences based on the following parameters: ≤2 mismatches per 36 bp and less than five total per 75 bp for tails. SNPs were identified based on the following criteria: two most common alleles supported by at least 30% of the aligned bases, at least five unique reads, the sum of the one or two most common alleles covering at least 80% of the aligned reads, and no polymorphisms in the first or last three base pairs of each read. From the SNPs, two datasets were created: MCR90 with up to 10% missing data and MCR50 with up to 50% missing data.
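The difference between the two datasets can be sketched in Python as a per-SNP call-rate filter. This is a minimal illustration that assumes the missing-data limit is applied to each SNP across samples (an interpretation, not a detail stated by the provider); the function name and the toy genotype table are hypothetical.

```python
def filter_by_call_rate(snp_calls, min_call_rate):
    """Keep SNPs genotyped in at least `min_call_rate` of the samples.

    snp_calls: dict mapping a SNP identifier to a list with one genotype
    call per sample; None marks a missing call.
    """
    kept = {}
    for snp_id, calls in snp_calls.items():
        if not calls:
            continue  # no samples recorded for this SNP
        call_rate = sum(c is not None for c in calls) / len(calls)
        if call_rate >= min_call_rate:
            kept[snp_id] = calls
    return kept


# Toy table with 10 samples per SNP (illustrative values only).
toy = {
    "snp_a": [0, 1, 2, 0, 1, 1, 0, 2, 1, 0],                          # 100% called
    "snp_b": [0, 1, None, 0, 1, 1, 0, 2, 1, 0],                       # 90% called
    "snp_c": [0, None, None, 0, None, 1, 0, None, 1, 0],              # 60% called
    "snp_d": [None, None, None, 0, None, 1, None, None, 1, 0],        # 40% called
}
mcr90 = filter_by_call_rate(toy, 0.90)  # up to 10% missing data
mcr50 = filter_by_call_rate(toy, 0.50)  # up to 50% missing data
print(sorted(mcr90), sorted(mcr50))
```

Under this reading, every SNP in the stricter dataset also appears in the more permissive one, so MCR90 is a subset of MCR50 with fewer sites and less missing data.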

4.3. Polyploidy Filtering

Because I. lacustris is a putative polyploid and many population genetic methods assume that species are (at most) diploid, polyRAD [24] was used to identify and filter loci that are diploid and tetraploid. The MCR90 and MCR50 datasets were filtered using the IteratePopStruct command to identify genotypes, and then the Hind/HE statistic [24,35] was employed to recognize diploid loci with Hind/HE < 0.5 and tetraploid loci with Hind/HE > 0.75. Datasets were created for each set of loci (Table 2). The number of SNPs in the diploid and tetraploid datasets does not equal the value in the initial datasets because of filtering with polyRAD.
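The threshold step can be sketched in Python as follows. The sketch assumes that a Hind/HE value has already been computed for each locus (for example, exported from polyRAD) and only applies the two cutoffs given above; the function names, the "unassigned" label, and the example values are hypothetical.

```python
def classify_locus(hind_he, diploid_max=0.5, tetraploid_min=0.75):
    """Assign a locus to a ploidy class from its Hind/HE value.

    Cutoffs follow the text: below 0.5 is treated as diploid and above
    0.75 as tetraploid. Values in between are left unassigned.
    """
    if hind_he < diploid_max:
        return "diploid"
    if hind_he > tetraploid_min:
        return "tetraploid"
    return "unassigned"


def split_loci(hind_he_by_locus):
    """Group locus identifiers by their assigned class."""
    groups = {"diploid": [], "tetraploid": [], "unassigned": []}
    for locus, value in hind_he_by_locus.items():
        groups[classify_locus(value)].append(locus)
    return groups


# Illustrative values only, not taken from the study.
example = {"loc1": 0.31, "loc2": 0.82, "loc3": 0.60, "loc4": 0.49}
groups = split_loci(example)
print(groups)
```

In this sketch, loci with values between the two cutoffs fall into neither subset, which is one reason the subset sizes need not sum to the size of the starting dataset.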

4.4. Population Genomics

Observed and expected heterozygosity measurements and F-statistics were calculated with hierfstat [36,37], and AMOVA was conducted with poppr [38]. All 24 populations were examined, as were the populations divided into three and four geographic clusters, which are based on the optimal K values from preliminary analyses in fastSTRUCTURE (Table 2) and patterns of population structure from STRUCTURE and MavericK. fastSTRUCTURE [39] was employed to identify population structure, including the optimal number of clusters (K), and for these analyses, values of K = 1–24 were analyzed for the six SNP datasets, using Structure_threader [40], on the Kettering University High-Performance Computing Cluster (KUHPC). Ten replicates were run for each K, with a convergence criterion of 0.000001, a simple prior, and 100 test sets for cross-validation. The CLUMPAK main pipeline, which includes CLUMPP [41] and DISTRUCT [42], was employed to organize, cluster, and visualize the results of independent fastSTRUCTURE analyses, via 10,000 permutations of the LargeKGreedy algorithm [43]. To identify the optimal K value(s), the model complexity that maximizes marginal likelihood from fastSTRUCTURE and the MedMedK, MedMeanK, MaxMedK, and MaxMeanK values determined by StructureSelector [44,45] were examined. These latter four metrics are useful for uneven sampling and are based on recognizing the number of clusters that include, at minimum, one subpopulation. Differences among these metrics are the result of the arithmetic mean or median used and the median or maximum number of clusters identified [45].
For comparison, and given potential variation in ploidy at loci [27], STRUCTURE [25] and MavericK [26] were also used, with Structure_threader, for analyses with the three MCR90 datasets. With STRUCTURE, the following parameters were used with K = 1–24: 1,000,000 steps and 500,000 burnin, with alpha and lambda of 1, and with or without admixture. Ten replicates were run for each K. CLUMPAK and StructureSelector were also used for STRUCTURE analyses, with the best K also determined via the method of Evanno et al. [46] and Ln Pr (X|K). MavericK analyses were run for K = 1–12 with five replicates per K, without admixture, using the following parameters for each replicate: 50,000 steps and 5000 burnin for Markov Chain Monte Carlo (MCMC) sampling and an alpha of 1500 steps and 5000 burnin, with 50 rungs, for thermodynamic integration (TI) sampling, and 100 expectation-maximization repeats. With MavericK, graphs were visualized with R [47], and the optimal K value was determined using TI.
To incorporate geographic data alongside the SNPs, tess3r [48] and conStruct [49] were used; all six datasets were analyzed with tess3r, but only the three MCR90 datasets were analyzed with conStruct. For tess3r, the alternating projected least squares method was undertaken for K = 1–24 for MCR90 and K = 1–12 for MCR50 datasets. Results for each K were visualized with bar graphs and maps in R [47], and the optimal K value was identified using the cross-validation plot for each dataset. For conStruct cross-validation, analyses were conducted with five replicates, for K = 1–8, using 10,000 MCMC iterations sampled every 1000 iterations and a training proportion of 0.5–0.8, depending on the dataset. Subsequently, analyses with K = 3–5 were conducted, with five replicates, using one chain run for 100,000 MCMC iterations sampled every 1000 iterations and with the spatial model.
In addition to analyses for explicit population structure, all datasets were analyzed with principal component analyses (PCA), correspondence analyses (CA), and discriminant analyses of principal components (DAPC) in adegenet [50], principal coordinate analyses (PCoA) in hierfstat [36,37], and isolation-by-distance (IBD) analyses in adegenet using separate Mantel tests for populations and individuals, with 999 simulations for each Mantel test. For DAPC for each dataset, the Bayesian Information Criterion (BIC) was used to identify the optimal number of clusters, and cross-validation was employed to explore the most appropriate number of PCs to retain for analysis.
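The logic of the Mantel permutation test used for the IBD analyses can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the routine called in adegenet: the function names are hypothetical, the statistic is a Pearson correlation between the upper triangles of the two distance matrices, and the distance matrices are invented toy data for six sites on a line.

```python
import random


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test: permute sample labels of the genetic matrix."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Toy data: six sites on a line; genetic distance grows with geographic distance.
sites = range(6)
geo = [[abs(i - j) for j in sites] for i in sites]
gen = [[0.0 if i == j else 0.05 * abs(i - j) + 0.001 * ((i * j) % 3)
        for j in sites] for i in sites]
r_obs, p_value = mantel_test(gen, geo, permutations=999)
print(r_obs, p_value)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual cells would break the matrix structure and give misleading p-values.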
Loci under selection were determined with BayeScan [51] using 100,000 iterations, a burnin of 50,000 iterations, a thinning interval of 10, and a sample size of 5000, and for each analysis, 20 pilot runs were conducted, each with 5000 steps. Loci under selection were visualized in R using FST values and a false discovery rate of 0.05.
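As an illustration of how a false discovery rate threshold of 0.05 translates into a set of outlier loci, the sketch below applies a standard Bayesian FDR calculation: loci are ranked by their posterior probability of being under selection, and the q-value of each locus is the mean posterior error probability of all loci ranked at or above it. This is an assumption about the general approach, not code taken from BayeScan, and the function name and posterior probabilities are invented.

```python
def bayesian_q_values(posterior_probs):
    """Return a q-value per locus, in the original locus order.

    The q-value of a locus is the expected FDR of the set made of that
    locus and every locus with a higher posterior probability.
    """
    order = sorted(range(len(posterior_probs)),
                   key=lambda i: posterior_probs[i], reverse=True)
    q_values = [0.0] * len(posterior_probs)
    running_error = 0.0
    for rank, locus in enumerate(order, start=1):
        running_error += 1.0 - posterior_probs[locus]  # posterior error prob.
        q_values[locus] = running_error / rank
    return q_values


# Invented posterior probabilities for four loci.
toy_probs = [0.99, 0.98, 0.60, 0.10]
q_vals = bayesian_q_values(toy_probs)
outliers = [i for i, q in enumerate(q_vals) if q <= 0.05]
print(q_vals, outliers)
```

Under this definition, a locus with a moderately high posterior probability can still be excluded if adding it would raise the expected proportion of false positives in the outlier set above the chosen threshold.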
Demographic history and patterns of migration were explored using BA3-SNPs [52,53], DIYABC Random Forest (DIYABC-RF) [54], and abcranger [55]. Only the three MCR90 datasets were used for these analyses, with the three and four aforementioned population clusters used (apart from all 24 populations investigated with MCR90 with BA3-SNPs). For BA3-SNPs, the datasets were each run for 50 million Markov Chain Monte Carlo (MCMC) iterations, with 20 million MCMC burnin iterations and a sampling interval of 2500 iterations, and the initial parameters for allele frequencies, inbreeding coefficient, and migration rates were tuned to vary between 0.2 and 0.6. For DIYABC-RF, the optimal scenario for patterns of diversification was examined among all 15 arrangements of four bifurcating populations. For each scenario, population size was modelled to vary after populations split and one and two other times for when the second and first populations diverge (Supplemental Figure S17). For analyses, all genetic diversity, FST distances, Nei’s distances, and admixture estimates were selected, and the analyses were run for 15 million simulations with a batch size of 1000. Using the results of the training, a random forest analysis was conducted with abcranger [55] using 1000 trees to identify the number of trees supporting each model and to estimate the parameters of the model, with and without linear discriminant analysis, for partial least squares (PLS) estimation on the optimal model for each dataset.
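The figure of 15 arrangements for four bifurcating populations agrees with the number of rooted binary tree topologies on n labeled tips, (2n − 3)!!, which for n = 4 is 1 × 3 × 5 = 15. The sketch below checks this by brute-force enumeration. It counts topologies only and does not model divergence times or population size changes; the function names are hypothetical, and the tip labels simply reuse the four cluster names.

```python
def add_leaf(tree, leaf):
    """Yield every rooted binary tree made by attaching `leaf` to one edge of `tree`."""
    yield (tree, leaf)  # attach on the edge above the current root
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in add_leaf(left, leaf):
            yield (new_left, right)
        for new_right in add_leaf(right, leaf):
            yield (left, new_right)


def rooted_topologies(leaves):
    """Enumerate all rooted binary topologies by adding one leaf at a time."""
    trees = [leaves[0]]
    for leaf in leaves[1:]:
        trees = [new for tree in trees for new in add_leaf(tree, leaf)]
    return trees


def count_rooted_topologies(n):
    """Closed form (2n - 3)!! for n >= 2 labeled tips."""
    count = 1
    for odd in range(3, 2 * n - 2, 2):  # 3, 5, ..., 2n - 3
        count *= odd
    return count


scenarios = rooted_topologies(["East", "Mid1", "Mid2", "West"])
print(len(scenarios), count_rooted_topologies(4))
```

Each tree with k tips has 2k − 1 edges (counting the edge above the root) where a new tip can attach, which is why the count grows as 1, 3, 15, 105 for two to five tips.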

4.5. Conservation Units

Conservation and management units were identified following the three-step method of Funk et al. [16], in which (1) evolutionarily significant units (ESUs) are recognized using all loci, (2) management units (MUs) are delimited with non-outlier loci, and (3) adaptive groups are determined using outlier loci. For the three steps, fastSTRUCTURE [39], PCA, and DAPC [50] were used. The first step was described above for datasets with all loci, and the other two steps were conducted using the same parameters for the three analyses and were based on two datasets (loci under and not under selection as determined via BayeScan [51]) for each MCR50 dataset and the all loci dataset of MCR90 (Table 2). The optimal K value was identified using StructureSelector [44], the model complexity that maximizes the marginal likelihood from fastSTRUCTURE [39], and the BIC for DAPC with adegenet [50]. Based on the results of these analyses, ESUs, MUs, and adaptive groups were identified (Supplemental Figure S18).

5. Conclusions

The present study provides evidence of genomic variation and local adaptation across the geographic range of the species, which is novel given the negligible genetic diversity previously recovered for I. lacustris [8,9,10]. However, as Van Kley and Wujek [6] stated thirty years ago, “Despite a preference for a somewhat disturbed habitat, Iris lacustris will not grow where the habitat has been destroyed by residential, resort, or industrial development”. Therefore, the conservation genetic results are of limited value if management steps are not taken to ensure that individuals of I. lacustris have the opportunity to be successful in situ. This includes not only ensuring intermediate light conditions and limited litter [5,6], but also ensuring that as much genetic diversity as possible across the entire geographic range of the species is conserved and managed appropriately. Indeed, given the local genetic diversity recognized among the nine adaptive units, it would be prudent to strive to conserve representatives from these areas. This is particularly important because the populations that are best able to adapt to the changing climate in the Great Lakes region are presently unknown [56]. Therefore, to ensure the longevity of this charismatic species, appropriate long-term management is necessary. Future work that includes the populations of I. lacustris from Ontario can extend the presented results to investigate the ways in which these populations relate to those in the United States. Given the international geographic range of the species, conservation efforts that are binational would be particularly useful.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants12132557/s1, Figure S1. Results for Isolation-by-Distance (IBD) for the six datasets. The x-axis is geographic distance, and the y-axis is genetic distance. Figure S2. Structure bar graphs from STRUCTURE for the six datasets analyzed in the present study for K (clusters) = 3–5. Individual ancestry denoted by color. Populations are denoted below each graph. Figure S3. Structure bar graphs from MavericK, without admixture, for the three MCR90 datasets analyzed in the present study for K (clusters) = 3–5. Individual ancestry denoted by color. Populations are denoted below each graph. Figure S4. tess3r maps of population assignation for the six datasets analyzed in the present study for K (clusters) = 3–5. Individual ancestry denoted by color. Figure S5. Maps and bar graphs of population assignation for the three MCR90 datasets analyzed in the present study for K (clusters) = 3–5. Individual ancestry denoted by color. Figure S6. Results for best K from StructureSelector for analyses with fastStructure for (A) MCR90 all loci, (B) MCR90 diploid loci, and (C) MCR90 tetraploid loci. Figure S7. Results for best K from StructureSelector for analyses with fastStructure for (A) MCR50 all loci, (B) MCR50 diploid loci, and (C) MCR50 tetraploid loci. Figure S8. Results for best K from StructureSelector for analyses with Structure without admixture for (A) MCR90 all loci, (B) MCR90 diploid loci, and (C) MCR90 tetraploid loci. Figure S9. Results for best K from StructureSelector for analyses with Structure with admixture for (A) MCR90 all loci, (B) MCR90 diploid loci, and (C) MCR90 tetraploid loci. Figure S10. Results for best K from MavericK, based on thermodynamic integration (TI), for analyses without admixture (A) MCR90 all loci, (B) MCR90 diploid loci, and (C) MCR90 tetraploid loci. Figure S11. Results for cross-validation scores for tess3r analyses for (A) MCR90 all loci, (B) MCR90 diploid loci, (C) MCR90 tetraploid loci, (D) MCR50 all loci, (E) MCR50 diploid loci, and (F) MCR50 tetraploid loci. Figure S12. Results for cross-validation scores for conStruct validation analyses for (A) MCR90 all loci, (B) MCR90 diploid loci, and (C) MCR90 tetraploid loci to identify best K (clusters). Graphs with blue and green dots are for spatial and non-spatial models, respectively, and graph with only blue dots displays predictive accuracy for spatial model with confidence intervals. Figure S13. Results for Bayesian Information Criterion (BIC), to identify best K (clusters), from discriminant analysis of principal components (DAPC) for (A) MCR90 all loci, (B) MCR90 diploid loci, (C) MCR90 tetraploid loci, (D) MCR50 all loci, (E) MCR50 diploid loci, and (F) MCR50 tetraploid loci. Figure S14. Results from StructureSelector for best K (clusters) fastStructure analyses for loci under selection for (A) MCR90 all loci, (B) MCR50 all loci, (C) MCR50 diploid loci, and (D) MCR50 tetraploid loci. Figure S15. Results from StructureSelector for best K (clusters) fastStructure analyses for loci not under selection for (A) MCR90 all loci, (B) MCR50 all loci, (C) MCR50 diploid loci, and (D) MCR50 tetraploid loci. Figure S16. Results for Bayesian Information Criterion (BIC), to identify best K (clusters), from discriminant analysis of principal components (DAPC) analyses for (A) MCR90 all loci under selection, (B) MCR50 all loci under selection, (C) MCR50 diploid loci under selection, (D) MCR50 tetraploid loci under selection, (E) MCR90 all loci not under selection, (F) MCR50 all loci not under selection, (G) MCR50 diploid loci not under selection, and (H) MCR50 tetraploid loci not under selection. Figure S17. 15 branching scenarios evaluated in DIYABC. Pop 1 is East, Pop 2 is Mid 1, Pop 3 is Mid 2, Pop 4 is West. See Table 1 for population assignation to each population. Change in color represents potential change in population size. Scenario 3 is optimal for all and tetraploid loci, and scenario 7 is optimal for diploid loci. Figure S18. Nine Adaptive Units recognized from population genetic analyses using loci under selection. Map of locations sampled in present study. Dark gray entire lines denote division between East, Mid1, Mid2, and West clusters (also recognized as Management Units). The dashed gray line separates Mid1 and Mid2 populations, and Mid includes both groups of populations together. Light gray lines separate Wisconsin (USA), Michigan (USA), and Ontario (Canada). Scale bar, in red, represents 50 kilometers. Table S1. AMOVA results for all datasets. Table S2. K values for the MCR90 datasets for STRUCTURE and MavericK.

Author Contributions

Conceptualization, J.I.C. and S.T.-C.; methodology, J.I.C. and S.T.-C.; formal analysis, J.I.C.; data curation, J.I.C.; writing—original draft preparation, J.I.C.; writing—review and editing, J.I.C. and S.T.-C.; visualization, J.I.C. and S.T.-C.; funding acquisition, J.I.C. and S.T.-C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a Kettering University Faculty Research Fellowship, the Michigan Natural Features Inventory (MNFI), and Weber State University. The APC was funded by MDPI.

Data Availability Statement

Data files in VCF format are available at the Dryad repository (https://doi.org/10.5061/dryad.xwdbrv1jh).

Acknowledgments

The authors thank R. Hackett, who collected the vast majority of the leaf material, and R. Bowman, who collected leaf material from Wisconsin. R. Hackett, P. Higman, C. Tansy, J. Dingledine, and S. Hicks provided invaluable conversation about the Dwarf Lake Iris. R. Hackett’s comments on a draft of the manuscript were very helpful. Three reviewers provided helpful comments to improve the manuscript. We appreciate L. Coffey and data2bio for tGBS sequencing. L. Clark provided helpful comments on using polyRAD.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nuttall, T. The Genera of North American Plants: And a Catalogue of the Species, to the Year 1817; D. Heartt: Philadelphia, PA, USA, 1817; Volume 1.
  2. U.S. Fish and Wildlife Service. Status Review—Dwarf Lake Iris (Iris lacustris); East Lansing Field Office: East Lansing, MI, USA, 2022; p. 10.
  3. Voss, E.G. Michigan Flora, 3rd ed.; Cranbrook Institute of Science: Bloomfield Hills, MI, USA; University of Michigan Herbarium: Ann Arbor, MI, USA, 1972; Volume 1, p. 488.
  4. Brotske, V. Pollination, Seed Dispersal, Germination, and Seedling Survival in the Federally Threatened Dwarf Lake Iris (Iris Lacustris). Master’s Thesis, University of Wisconsin-Green Bay, Green Bay, WI, USA, 2018.
  5. U.S. Fish and Wildlife Service. 5-Year Review Dwarf Lake Iris (Iris lacustris); U.S. Fish and Wildlife Service: East Lansing, MI, USA, 2011; p. 21.
  6. Van Kley, J.E.; Wujek, D.E. Habitat and ecology of Iris lacustris (the dwarf lake iris). Mich. Bot. 1993, 32, 209–222.
  7. State of Michigan. State Facts and Symbols. Available online: https://www.michigan.gov/som/about-michigan/state-facts-and-symbols (accessed on 15 January 2023).
  8. Simonich, M.T.; Morgan, M.D. Allozymic uniformity in Iris lacustris (dwarf lake iris) in Wisconsin. Can. J. Bot. 1994, 72, 1720–1722.
  9. Orick, M.W. Enzyme Polymorphism and Genetic Diversity in the Great Lakes Endemic Iris lacustris Nutt. (Dwarf Lake Iris). Master’s Thesis, Eastern Michigan University, Ypsilanti, MI, USA, 1992.
  10. Hannan, G.L.; Orick, M.W. Isozyme diversity in Iris cristata and the threatened glacial endemic I. lacustris (Iridaceae). Am. J. Bot. 2000, 87, 293–301.
  11. Guo, J.; Wilson, C.A. Molecular phylogeny of crested Iris based on five plastid markers (Iridaceae). Syst. Bot. 2013, 38, 987–995.
  12. Soltis, P.S.; Soltis, D.E. The role of genetic and genomic attributes in the success of polyploids. Proc. Natl. Acad. Sci. USA 2000, 97, 7051–7057.
  13. Luttikhuizen, P.C.; Stift, M.; Kuperus, P.; Van Tienderen, P.H. Genetic diversity in diploid vs. tetraploid Rorippa amphibia (Brassicaceae). Mol. Ecol. 2007, 16, 3544–3553.
  14. Van de Peer, Y.; Ashman, T.-L.; Soltis, P.S.; Soltis, D.E. Polyploidy: An evolutionary and ecological force in stressful times. Plant Cell 2021, 33, 11–26.
  15. Ott, A.; Liu, S.; Schnable, J.C.; Yeh, C.-T.E.; Wang, K.-S.; Schnable, P.S. tGBS® genotyping-by-sequencing enables reliable genotyping of heterozygous loci. Nucleic Acids Res. 2017, 45, e178.
  16. Funk, W.C.; McKay, J.K.; Hohenlohe, P.A.; Allendorf, F.W. Harnessing genomics for delineating conservation units. Trends Ecol. Evol. 2012, 27, 489–496.
  17. Millar, M.A.; Byrne, M. Variable clonality and genetic structure among disjunct populations of Banksia mimica. Conserv. Genet. 2020, 21, 803–818.
  18. Edgeloe, J.M.; Severn-Ellis, A.A.; Bayer, P.E.; Mehravi, S.; Breed, M.F.; Krauss, S.L.; Batley, J.; Kendrick, G.A.; Sinclair, E.A. Extensive polyploid clonality was a successful strategy for seagrass to expand into a newly submerged environment. Proc. R. Soc. B 2022, 289, 20220538.
  19. Sessa, E.B. Polyploidy as a mechanism for surviving global change. New Phytol. 2019, 221, 5–6.
  20. Fant, J.B.; Havens, K.; Keller, J.M.; Radosavljevic, A.; Yates, E.D. The influence of contemporary and historic landscape features on the genetic structure of the sand dune endemic, Cirsium pitcheri (Asteraceae). Heredity 2014, 112, 519–530.
  21. Kincare, K.; Larson, G.J. Evolution of the Great Lakes. In Michigan Geography and Geology; Schaetzl, R.J., Darden, J.T., Brandt, D., Eds.; Pearson Custom Publishing: Boston, MA, USA, 2009; pp. 174–190.
  22. Larson, G.; Schaetzl, R. Origin and evolution of the Great Lakes. J. Great Lakes Res. 2001, 27, 518–546.
  23. Chung, M.Y.; López-Pujol, J.; Lee, Y.M.; Oh, S.H.; Chung, M.G. Clonal and genetic structure of Iris odaesanensis and Iris rossii (Iridaceae): Insights of the Baekdudaegan Mountains as a glacial refugium for boreal and temperate plants. Plant Syst. Evol. 2015, 301, 1397–1409.
  24. Clark, L.V.; Lipka, A.E.; Sacks, E.J. polyRAD: Genotype calling with uncertainty from sequencing data in polyploids and diploids. G3 Genes Genomes Genet. 2019, 9, 663–673.
  25. Pritchard, J.K.; Stephens, M.; Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 2000, 155, 945–959.
  26. Verity, R.; Nichols, R.A. Estimating the number of subpopulations (K) in structured populations. Genetics 2016, 203, 1827–1839.
  27. Stift, M.; Kolář, F.; Meirmans, P.G. STRUCTURE is more robust than other clustering methods in simulated mixed-ploidy populations. Heredity 2019, 123, 429–441.
  28. Chafin, T.K.; Regmi, B.; Douglas, M.R.; Edds, D.R.; Wangchuk, K.; Dorji, S.; Norbu, P.; Norbu, S.; Changlu, C.; Khanal, G.P. Parallel introgression, not recurrent emergence, explains apparent elevational ecotypes of polyploid Himalayan snowtrout. R. Soc. Open Sci. 2021, 8, 210727.
  29. Salvado, P.; Aymerich Boixader, P.; Parera, J.; Vila Bonfill, A.; Martin, M.; Quélennec, C.; Lewin, J.M.; Delorme-Hinoux, V.; Bertrand, J.A.M. Little hope for the polyploid endemic Pyrenean Larkspur (Delphinium montanum): Evidences from population genomics and Ecological Niche Modeling. Ecol. Evol. 2022, 12, e8711.
  30. Barnes, B.V.; Wagner, W.H., Jr. Michigan Trees. A Guide to the Trees of Michigan and the Great Lakes Region; University of Michigan Press: Ann Arbor, MI, USA, 1981.
  31. Walker, W.S.; Barnes, B.V.; Kashian, D.M. Landscape ecosystems of the Mack Lake burn, northern Lower Michigan, and the occurrence of the Kirtland’s warbler. For. Sci. 2003, 49, 119–139.
  32. Fu, L.; Niu, B.; Zhu, Z.; Wu, S.; Li, W. CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 2012, 28, 3150–3152.
  33. Schulz, M.H.; Weese, D.; Holtgrewe, M.; Dimitrova, V.; Niu, S.; Reinert, K.; Richard, H. Fiona: A parallel and automatic strategy for read error correction. Bioinformatics 2014, 30, i356–i363.
  34. Wu, T.D.; Nacu, S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics 2010, 26, 873–881.
  35. Clark, L.V.; Mays, W.; Lipka, A.E.; Sacks, E.J. A population-level statistic for assessing Mendelian behavior of genotyping-by-sequencing data from highly duplicated genomes. BMC Bioinform. 2022, 23, 101.
  36. De Meeûs, T.; Goudet, J. A step-by-step tutorial to use HierFstat to analyse populations hierarchically structured at multiple levels. Infect. Genet. Evol. 2007, 7, 731–735.
  37. Goudet, J. HIERFSTAT, a package for R to compute and test hierarchical F-statistics. Mol. Ecol. Notes 2005, 5, 184–186.
  38. Kamvar, Z.N.; Tabima, J.F.; Grünwald, N.J. Poppr: An R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ 2014, 2, e281.
  39. Raj, A.; Stephens, M.; Pritchard, J.K. fastSTRUCTURE: Variational inference of population structure in large SNP data sets. Genetics 2014, 197, 573–589.
  40. Pina-Martins, F.; Silva, D.N.; Fino, J.; Paulo, O.S. Structure_threader: An improved method for automation and parallelization of programs structure, fastStructure and MavericK on multicore CPU systems. Mol. Ecol. Resour. 2017, 17, e268–e274.
  41. Jakobsson, M.; Rosenberg, N.A. CLUMPP: A cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 2007, 23, 1801–1806.
  42. Rosenberg, N.A. DISTRUCT: A program for the graphical display of population structure. Mol. Ecol. Notes 2004, 4, 137–138.
  43. Kopelman, N.M.; Mayzel, J.; Jakobsson, M.; Rosenberg, N.A.; Mayrose, I. CLUMPAK: A program for identifying clustering modes and packaging population structure inferences across K. Mol. Ecol. Resour. 2015, 15, 1179–1191.
  44. Li, Y.L.; Liu, J.X. STRUCTURESELECTOR: A web-based software to select and visualize the optimal number of clusters using multiple methods. Mol. Ecol. Resour. 2018, 18, 176–177.
  45. Puechmaille, S.J. The program STRUCTURE does not reliably recover the correct population structure when sampling is uneven: Subsampling and new estimators alleviate the problem. Mol. Ecol. Resour. 2016, 16, 608–627.
  46. Evanno, G.; Regnaut, S.; Goudet, J. Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study. Mol. Ecol. 2005, 14, 2611–2620.
  47. R Development Core Team. A Language and Environment for Statistical Computing. 2009. Available online: http://www.R-project.org (accessed on 5 January 2023).
  48. Caye, K.; Jay, F.; Michel, O.; François, O. Fast inference of individual admixture coefficients using geographic data. Ann. Appl. Stat. 2018, 12, 586–608.
  49. Bradburd, G.S.; Coop, G.M.; Ralph, P.L. Inferring continuous and discrete population genetic structure across space. Genetics 2018, 210, 33–52.
  50. Jombart, T.; Ahmed, I. adegenet 1.3-1: New tools for the analysis of genome-wide SNP data. Bioinformatics 2011, 27, 3070–3071.
  51. Foll, M.; Gaggiotti, O. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: A Bayesian perspective. Genetics 2008, 180, 977–993.
  52. Mussmann, S.M.; Douglas, M.R.; Chafin, T.K.; Douglas, M.E. BA3-SNPs: Contemporary migration reconfigured in BayesAss for next-generation sequence data. Methods Ecol. Evol. 2019, 10, 1808–1813.
  53. Wilson, G.A.; Rannala, B. Bayesian inference of recent migration rates using multilocus genotypes. Genetics 2003, 163, 1177–1191.
  54. Collin, F.D.; Durif, G.; Raynal, L.; Lombaert, E.; Gautier, M.; Vitalis, R.; Marin, J.M.; Estoup, A. Extending approximate Bayesian computation with supervised machine learning to infer demographic history from genetic polymorphisms using DIYABC Random Forest. Mol. Ecol. Resour. 2021, 21, 2598–2613.
  55. Collin, F.-D.; Estoup, A.; Marin, J.-M.; Raynal, L. Bringing ABC inference to the machine learning realm: AbcRanger, an optimized random forests library for ABC. In Proceedings of the JOBIM 2020, Montpellier, France, 30 June 2020.
  56. Byun, K.; Chiu, C.-M.; Hamlet, A.F. Effects of 21st century climate change on seasonal flow regimes and hydrologic extremes over the Midwest and Great Lakes region of the US. Sci. Total Environ. 2019, 650, 1261–1277.
Figure 1. Map of locations sampled in present study. Dark gray entire lines denote division between East, Mid1, Mid2, and West clusters (also recognized as management units). The dashed gray line separates Mid1 and Mid2 populations, and Mid includes both groups of populations together. Light gray lines separate Wisconsin (USA), Michigan (USA), and Ontario (Canada). Scale bar is 100 km, with each section representing 50 km.
Figure 2. Structure bar graphs from fastSTRUCTURE for the six datasets analyzed in the present study for K = 3–5. Individual ancestry denoted by color.
Figure 3. Patterns of migration based on MCR90 all loci (A–C) and MCR90 diploid loci (D,E) as resolved using BA3-SNPs. (A) All populations, (B,D) 4 populations, (C,E) 3 populations. Outermost circle denotes each population, and inner circle shows origin of migrants from each population. Lines connecting populations demonstrate patterns of migration.
Figure 4. Structure bar graphs for MCR90 all loci and three MCR50 datasets for loci under and not under selection (adaptive units and management units, respectively). Individual ancestry denoted by color. Groups for each listed in Table 1, and best K values noted in Table 2.
Table 1. Population and sampling information and assignation of populations to clusters based on results of various population genomic analyses, including recognition of management and adaptive units, based on loci not under and under selection, respectively. Cluster, management unit, and adaptive unit assignation is based on population genetic analyses with fastStructure, discriminant analysis of principal components (DAPC), principal component analyses (PCA), and others described in the text.
| Populations Sampled | Number of Individuals Sampled | Four Population Clusters in Analyses | Three Population Clusters in Analyses | Management Units (All Loci) | Management Units (Diploid and Tetraploid Loci) | Adaptive Units |
|---|---|---|---|---|---|---|
| MI1 | 10 | East | East | 1 | 1 | 1 |
| MI2 | 3 | Mid1 | Mid | 2 | 2 | 2 |
| MI3 | 8 | Mid1 | Mid | 2 | 2 | 3 |
| MI4 | 7 | Mid1 | Mid | 2 | 2 | 3 |
| MI5 | 7 | Mid2 | Mid | 3 | 3 | 4 |
| MI6 | 14 | West | West | 4 | 4 | 5 |
| MI7 | 8 | Mid2 | Mid | 2 | 2 | 4 |
| MI8 | 4 | Mid2 | Mid | 3 | 3 | 4 |
| MI9 | 3 | Mid2 | Mid | 2 | 2 | 6 |
| MI10 | 5 | East | East | 1 | 3 | 7 |
| MI11 | 1 | Mid1 | Mid | 2 | 2 | 3 |
| MI12 | 2 | Mid1 | Mid | 2 | 2 | 3 |
| MI13 | 3 | Mid2 | Mid | 3 | 3 | 4 |
| MI14 | 13 | East | East | 1 | 3 | 7 |
| MI15 | 8 | East | East | 1 | 1 | 7 |
| MI16 | 3 | West | West | 4 | 2 | 5 |
| MI17 | 10 | Mid2 | Mid | 2 | 2 | 6 |
| MI18 | 10 | East | East | 1 | 1 | 1 |
| MI19 | 10 | Mid2 | Mid | 2 | 2 | 6 |
| MI20 | 3 | Mid1 | Mid | 2 | 2 | 2 |
| MI21 | 7 | East | East | 1 | 3 | 1 |
| MI22 | 8 | Mid2 | Mid | 3 | 3 | 4 |
| WI4 | 12 | West | West | 4 | 2 | 8 |
| WI5 | 12 | West | West | 4 | 2 | 9 |
Table 2. Information on six SNP (single nucleotide polymorphism) datasets examined including best K (cluster) value under various analyses. Dashes indicate analysis was not performed for dataset. StructureSelector results include MedMedK, MedMeanK, MaxMedK, and MaxMeanK, and, therefore, may have a range of best K values due to different results from these four metrics. DAPC is discriminant analysis of principal components, and for these analyses, best K value is determined via Bayesian Information Criterion. Additional information on identification of loci under selection and best K values in text.
| Dataset | SNPs | Loci under Selection | All Loci: StructureSelector | All Loci: DAPC | Loci under Selection: StructureSelector | Loci under Selection: DAPC | Loci Not under Selection: StructureSelector | Loci Not under Selection: DAPC |
|---|---|---|---|---|---|---|---|---|
| MCR90 | 5354 | 401 | 6 | 9 | 12–14 | 13 | 3–4 | 7 |
| MCR90 diploid loci | 2106 | 29 | 4–5 | 7 | - | - | - | - |
| MCR90 tetraploid loci | 1382 | 21 | 4–5 | 6 | - | - | - | - |
| MCR50 | 344,509 | 65,075 | 5–7 | 4 | 11–13 | 10 | 3 | 1 |
| MCR50 diploid loci | 50,134 | 4311 | 3–4 | 2–3 | 9–10 | 7 | 2–3 | 1 |
| MCR50 tetraploid loci | 82,237 | 6939 | 3–4 | 2–3 | 8 | 9 | 3 | 1 |
Table 3. Observed, expected, and total heterozygosity (HO, HS, HT) and fixation index (FIS) for the three MCR90 datasets for each population. Sample sizes are less than five for MI2, MI8, MI9, MI11, MI12, MI13, MI16, MI20, which could impact calculated statistics.
Table 3. Observed, expected, and total heterozygosity (HO, HS, HT) and fixation index (FIS) for the three MCR90 datasets for each population. Sample sizes are less than five for MI2, MI8, MI9, MI11, MI12, MI13, MI16, MI20, which could impact calculated statistics.
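| Population | All: HO | All: HS | All: HT | All: FIS | Diploid: HO | Diploid: HS | Diploid: HT | Diploid: FIS | Tetraploid: HO | Tetraploid: HS | Tetraploid: HT | Tetraploid: FIS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MI1 | 0.0586 | 0.0516 | 0.0516 | −0.1365 | 0.064 | 0.0519 | 0.0519 | −0.2325 | 0.0548 | 0.0462 | 0.0462 | −0.1875 |
| MI2 | 0.0503 | 0.0411 | 0.0411 | −0.2224 | 0.0538 | 0.0417 | 0.0417 | −0.2883 | 0.0532 | 0.0431 | 0.0431 | −0.2329 |
| MI3 | 0.0472 | 0.0307 | 0.0307 | −0.5394 | 0.0521 | 0.0322 | 0.0322 | −0.6191 | 0.0474 | 0.0301 | 0.0301 | −0.5768 |
| MI4 | 0.0581 | 0.0451 | 0.0451 | −0.2873 | 0.0651 | 0.0479 | 0.0479 | −0.3582 | 0.0628 | 0.0497 | 0.0497 | −0.2624 |
| MI5 | 0.0558 | 0.0532 | 0.0532 | −0.047 | 0.052 | 0.0428 | 0.0428 | −0.2161 | 0.051 | 0.041 | 0.041 | −0.2444 |
| MI6 | 0.0957 | 0.0704 | 0.0704 | −0.3593 | 0.1043 | 0.0742 | 0.0742 | −0.4064 | 0.0933 | 0.0675 | 0.0675 | −0.3826 |
| MI7 | 0.054 | 0.049 | 0.049 | −0.1021 | 0.0519 | 0.042 | 0.042 | −0.2372 | 0.0559 | 0.0449 | 0.0449 | −0.2455 |
| MI8 | 0.0655 | 0.0563 | 0.0563 | −0.1631 | 0.0677 | 0.0573 | 0.0573 | −0.1816 | 0.067 | 0.0555 | 0.0555 | −0.2074 |
| MI9 | 0.0594 | 0.0401 | 0.0401 | −0.4814 | 0.0586 | 0.0373 | 0.0373 | −0.5714 | 0.0673 | 0.0442 | 0.0442 | −0.5217 |
| MI10 | 0.0612 | 0.0477 | 0.0477 | −0.2842 | 0.0554 | 0.043 | 0.043 | −0.289 | 0.0515 | 0.0383 | 0.0383 | −0.3455 |
| MI11 | 0.0475 | - | - | - | 0.0527 | - | - | - | 0.0499 | - | - | - |
| MI12 | 0.0522 | 0.0385 | 0.0385 | −0.3578 | 0.0592 | 0.0411 | 0.0411 | −0.4413 | 0.0551 | 0.0433 | 0.0433 | −0.2749 |
| MI13 | 0.0557 | 0.0447 | 0.0447 | −0.2463 | 0.0508 | 0.0409 | 0.0409 | −0.2434 | 0.0543 | 0.0401 | 0.0401 | −0.3551 |
| MI14 | 0.0535 | 0.0488 | 0.0488 | −0.0981 | 0.0542 | 0.0448 | 0.0448 | −0.2105 | 0.0497 | 0.0415 | 0.0415 | −0.1989 |
| MI15 | 0.0573 | 0.0551 | 0.0551 | −0.0395 | 0.059 | 0.0516 | 0.0516 | −0.1434 | 0.0589 | 0.0502 | 0.0502 | −0.1724 |
| MI16 | 0.0961 | 0.0694 | 0.0694 | −0.384 | 0.0956 | 0.0661 | 0.0661 | −0.4464 | 0.0795 | 0.0541 | 0.0541 | −0.4686 |
| MI17 | 0.0671 | 0.062 | 0.062 | −0.0827 | 0.0609 | 0.0488 | 0.0488 | −0.2478 | 0.0647 | 0.0524 | 0.0524 | −0.2357 |
| MI18 | 0.0639 | 0.0567 | 0.0567 | −0.1277 | 0.0672 | 0.0542 | 0.0542 | −0.2401 | 0.0643 | 0.0534 | 0.0534 | −0.2054 |
| MI19 | 0.0651 | 0.0575 | 0.0575 | −0.1324 | 0.0645 | 0.0512 | 0.0512 | −0.2604 | 0.0591 | 0.0487 | 0.0487 | −0.215 |
| MI20 | 0.0467 | 0.0343 | 0.0343 | −0.3597 | 0.0481 | 0.0351 | 0.0351 | −0.3711 | 0.0516 | 0.0365 | 0.0365 | −0.4123 |
| MI21 | 0.0624 | 0.054 | 0.054 | −0.1557 | 0.0628 | 0.0497 | 0.0497 | −0.262 | 0.0637 | 0.0512 | 0.0512 | −0.2435 |
| MI22 | 0.0543 | 0.0492 | 0.0492 | −0.1035 | 0.0464 | 0.0382 | 0.0382 | −0.2135 | 0.049 | 0.0397 | 0.0397 | −0.2343 |
| WI4 | 0.1081 | 0.0946 | 0.0946 | −0.1424 | 0.1032 | 0.0848 | 0.0848 | −0.2179 | 0.0895 | 0.0745 | 0.0745 | −0.201 |
| WI5 | 0.1033 | 0.085 | 0.085 | −0.2157 | 0.1015 | 0.0775 | 0.0775 | −0.3102 | 0.094 | 0.0734 | 0.0734 | −0.2814 |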
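As a reading aid for Table 3, the sketch below recomputes FIS from the tabulated HO and HS using the textbook relationship FIS = 1 − HO/HS. This is an assumption about how the columns relate, not a statement of the authors' exact pipeline: the function name `fis` is hypothetical, and software that averages over loci before or after taking the ratio will give slightly different values, which may explain why the recomputed numbers agree with the table only to about two decimal places.

```python
def fis(h_o: float, h_s: float) -> float:
    """Textbook inbreeding coefficient: F_IS = 1 - H_O / H_S.

    Negative values mean observed heterozygosity exceeds the
    within-population expectation (a heterozygote excess).
    """
    if h_s <= 0:
        raise ValueError("H_S must be positive")
    return 1.0 - h_o / h_s


# (H_O, H_S, reported F_IS) copied from the all-loci columns of Table 3.
reported = {
    "MI1": (0.0586, 0.0516, -0.1365),
    "MI3": (0.0472, 0.0307, -0.5394),
    "WI4": (0.1081, 0.0946, -0.1424),
}

for pop, (h_o, h_s, table_fis) in reported.items():
    print(f"{pop}: recomputed {fis(h_o, h_s):.4f}, reported {table_fis:.4f}")
```

Every population with a reported FIS has HO greater than HS, so the sign of FIS is negative throughout the table.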
Table 4. Pairwise FST values and heatmap for MCR90 all loci (below diagonal) and MCR90 diploid loci (above diagonal). Below the diagonal, red indicates lower values, and blue is for higher values. Above the diagonal, yellow is for lower values, and green is for higher values. Sample sizes are less than five for MI2, MI8, MI9, MI11, MI12, MI13, MI16, MI20, which could impact calculated statistics.
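| | MI1 | MI2 | MI3 | MI4 | MI5 | MI6 | MI7 | MI8 | MI9 | MI10 | MI11 | MI12 | MI13 | MI14 | MI15 | MI16 | MI17 | MI18 | MI19 | MI20 | MI21 | MI22 | WI4 | WI5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MI1 | - | 0.17 | 0.20 | 0.21 | 0.16 | 0.25 | 0.18 | 0.16 | 0.22 | 0.14 | 0.12 | 0.17 | 0.13 | 0.11 | 0.09 | 0.26 | 0.18 | 0.05 | 0.20 | 0.16 | 0.05 | 0.15 | 0.15 | 0.19 |
| MI2 | 0.29 | - | 0.12 | 0.12 | 0.16 | 0.24 | 0.16 | 0.18 | 0.24 | 0.20 | 0.07 | 0.11 | 0.13 | 0.16 | 0.15 | 0.27 | 0.16 | 0.19 | 0.17 | 0.00 | 0.19 | 0.11 | 0.16 | 0.20 |
| MI3 | 0.36 | 0.26 | - | −0.01 | 0.19 | 0.21 | 0.19 | 0.20 | 0.26 | 0.27 | 0.00 | 0.05 | 0.22 | 0.21 | 0.20 | 0.28 | 0.13 | 0.22 | 0.14 | 0.14 | 0.22 | 0.18 | 0.13 | 0.16 |
| MI4 | 0.37 | 0.26 | 0.01 | - | 0.19 | 0.24 | 0.18 | 0.19 | 0.20 | 0.25 | −0.08 | 0.03 | 0.19 | 0.22 | 0.20 | 0.26 | 0.14 | 0.23 | 0.15 | 0.10 | 0.23 | 0.18 | 0.16 | 0.19 |
| MI5 | 0.26 | 0.29 | 0.30 | 0.31 | - | 0.20 | 0.08 | 0.04 | 0.20 | 0.12 | 0.12 | 0.18 | 0.08 | 0.12 | 0.12 | 0.23 | 0.12 | 0.17 | 0.14 | 0.16 | 0.16 | 0.06 | 0.14 | 0.15 |
| MI6 | 0.41 | 0.39 | 0.37 | 0.39 | 0.31 | - | 0.23 | 0.19 | 0.22 | 0.24 | 0.14 | 0.19 | 0.20 | 0.24 | 0.23 | 0.13 | 0.20 | 0.25 | 0.20 | 0.21 | 0.24 | 0.22 | 0.19 | 0.20 |
| MI7 | 0.32 | 0.35 | 0.37 | 0.37 | 0.12 | 0.35 | - | 0.08 | 0.19 | 0.19 | 0.11 | 0.18 | 0.16 | 0.15 | 0.15 | 0.25 | 0.12 | 0.20 | 0.14 | 0.14 | 0.18 | 0.11 | 0.15 | 0.18 |
| MI8 | 0.31 | 0.38 | 0.41 | 0.40 | 0.07 | 0.34 | 0.14 | - | 0.18 | 0.15 | 0.04 | 0.14 | 0.10 | 0.14 | 0.12 | 0.20 | 0.11 | 0.17 | 0.13 | 0.16 | 0.16 | 0.09 | 0.12 | 0.15 |
| MI9 | 0.42 | 0.43 | 0.44 | 0.38 | 0.32 | 0.40 | 0.34 | 0.39 | - | 0.29 | 0.18 | 0.25 | 0.28 | 0.24 | 0.21 | 0.25 | 0.10 | 0.24 | 0.12 | 0.25 | 0.25 | 0.24 | 0.13 | 0.14 |
| MI10 | 0.20 | 0.34 | 0.44 | 0.42 | 0.23 | 0.38 | 0.32 | 0.29 | 0.45 | - | 0.19 | 0.25 | 0.13 | 0.03 | 0.06 | 0.28 | 0.20 | 0.13 | 0.21 | 0.21 | 0.09 | 0.12 | 0.14 | 0.19 |
| MI11 | 0.30 | 0.24 | 0.05 | −0.07 | 0.21 | 0.31 | 0.29 | 0.30 | 0.38 | 0.38 | - | −0.03 | 0.12 | 0.14 | 0.10 | 0.13 | 0.05 | 0.14 | 0.06 | 0.07 | 0.14 | 0.12 | 0.02 | 0.07 |
| MI12 | 0.33 | 0.25 | 0.07 | 0.03 | 0.26 | 0.35 | 0.34 | 0.36 | 0.41 | 0.41 | 0.01 | - | 0.17 | 0.19 | 0.15 | 0.21 | 0.13 | 0.18 | 0.13 | 0.14 | 0.20 | 0.18 | 0.09 | 0.13 |
| MI13 | 0.21 | 0.26 | 0.37 | 0.33 | 0.16 | 0.33 | 0.27 | 0.23 | 0.44 | 0.22 | 0.28 | 0.31 | - | 0.09 | 0.09 | 0.25 | 0.15 | 0.14 | 0.16 | 0.15 | 0.14 | 0.02 | 0.12 | 0.15 |
| MI14 | 0.15 | 0.31 | 0.38 | 0.39 | 0.24 | 0.40 | 0.30 | 0.28 | 0.42 | 0.04 | 0.33 | 0.36 | 0.18 | - | 0.03 | 0.26 | 0.18 | 0.10 | 0.19 | 0.15 | 0.07 | 0.09 | 0.14 | 0.19 |
| MI15 | 0.17 | 0.32 | 0.38 | 0.39 | 0.24 | 0.39 | 0.29 | 0.26 | 0.41 | 0.09 | 0.30 | 0.34 | 0.19 | 0.04 | - | 0.23 | 0.16 | 0.09 | 0.18 | 0.13 | 0.08 | 0.11 | 0.12 | 0.17 |
| MI16 | 0.43 | 0.44 | 0.44 | 0.43 | 0.32 | 0.19 | 0.37 | 0.34 | 0.42 | 0.43 | 0.30 | 0.37 | 0.39 | 0.42 | 0.39 | - | 0.20 | 0.25 | 0.20 | 0.26 | 0.26 | 0.27 | 0.14 | 0.14 |
| MI17 | 0.28 | 0.25 | 0.22 | 0.25 | 0.17 | 0.32 | 0.21 | 0.22 | 0.18 | 0.27 | 0.13 | 0.20 | 0.21 | 0.27 | 0.26 | 0.28 | - | 0.20 | 0.07 | 0.14 | 0.18 | 0.13 | 0.13 | 0.16 |
| MI18 | 0.12 | 0.33 | 0.38 | 0.40 | 0.29 | 0.41 | 0.33 | 0.32 | 0.42 | 0.17 | 0.32 | 0.35 | 0.23 | 0.14 | 0.15 | 0.41 | 0.29 | - | 0.21 | 0.18 | 0.04 | 0.16 | 0.15 | 0.21 |
| MI19 | 0.33 | 0.30 | 0.28 | 0.31 | 0.22 | 0.35 | 0.25 | 0.25 | 0.27 | 0.32 | 0.21 | 0.26 | 0.26 | 0.31 | 0.30 | 0.34 | 0.11 | 0.33 | - | 0.15 | 0.20 | 0.15 | 0.15 | 0.17 |
| MI20 | 0.29 | −0.01 | 0.29 | 0.25 | 0.27 | 0.36 | 0.33 | 0.38 | 0.45 | 0.35 | 0.24 | 0.28 | 0.29 | 0.31 | 0.31 | 0.42 | 0.22 | 0.32 | 0.28 | - | 0.18 | 0.10 | 0.12 | 0.16 |
| MI21 | 0.09 | 0.34 | 0.39 | 0.40 | 0.26 | 0.39 | 0.31 | 0.30 | 0.42 | 0.13 | 0.32 | 0.36 | 0.21 | 0.09 | 0.13 | 0.41 | 0.27 | 0.05 | 0.32 | 0.33 | - | 0.14 | 0.14 | 0.19 |
| MI22 | 0.19 | 0.22 | 0.28 | 0.30 | 0.09 | 0.33 | 0.17 | 0.15 | 0.34 | 0.17 | 0.21 | 0.26 | 0.04 | 0.15 | 0.17 | 0.36 | 0.16 | 0.22 | 0.21 | 0.21 | 0.18 | - | 0.15 | 0.18 |
| WI4 | 0.27 | 0.26 | 0.23 | 0.27 | 0.22 | 0.30 | 0.28 | 0.24 | 0.27 | 0.22 | 0.12 | 0.19 | 0.20 | 0.24 | 0.23 | 0.25 | 0.20 | 0.25 | 0.25 | 0.21 | 0.24 | 0.22 | - | 0.15 |
| WI5 | 0.33 | 0.32 | 0.24 | 0.28 | 0.26 | 0.27 | 0.30 | 0.27 | 0.27 | 0.30 | 0.16 | 0.21 | 0.26 | 0.32 | 0.30 | 0.21 | 0.23 | 0.33 | 0.27 | 0.28 | 0.31 | 0.26 | 0.24 | - |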
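Because Table 4 packs two datasets into one matrix (all loci below the diagonal, diploid loci above it), looking up a pair means choosing the correct triangle. The sketch below illustrates this on the MI1 to MI4 corner of the table only. The names `POPS`, `COMBINED`, and `pairwise_fst` are hypothetical helpers introduced for illustration and are not part of the study's analysis.

```python
# Populations and values copied from the MI1-MI4 corner of Table 4.
# Rows and columns follow the same order; None marks the diagonal.
POPS = ["MI1", "MI2", "MI3", "MI4"]
COMBINED = [
    [None, 0.17, 0.20, 0.21],
    [0.29, None, 0.12, 0.12],
    [0.36, 0.26, None, -0.01],
    [0.37, 0.26, 0.01, None],
]


def pairwise_fst(pop_a: str, pop_b: str, dataset: str) -> float:
    """Return the pairwise FST for two populations from the combined matrix.

    dataset="all"     reads below the diagonal (row index > column index).
    dataset="diploid" reads above the diagonal (row index < column index).
    """
    i, j = POPS.index(pop_a), POPS.index(pop_b)
    if i == j:
        raise ValueError("FST is not tabulated for a population with itself")
    low, high = min(i, j), max(i, j)
    if dataset == "all":
        return COMBINED[high][low]
    if dataset == "diploid":
        return COMBINED[low][high]
    raise ValueError("dataset must be 'all' or 'diploid'")


print(pairwise_fst("MI1", "MI3", "all"))      # below-diagonal entry
print(pairwise_fst("MI1", "MI3", "diploid"))  # above-diagonal entry
```

The argument order does not matter, since each triangle stores one value per unordered pair.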
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cohen, J.I.; Turgman-Cohen, S. The Conservation Genetics of Iris lacustris (Dwarf Lake Iris), a Great Lakes Endemic. Plants 2023, 12, 2557. https://doi.org/10.3390/plants12132557
