Genetic Diversity Revealed by Microsatellites in Genus Carya

Abstract: The genus Carya consists of 17 species divided into 3 sections: Carya or the true hickories, Apocarya or the pecan hickories, and Sinocarya or the Asian hickories. Interspecific hybrids exist and have been used in pecan cultivar development. Nuclear and plastid microsatellite or SSR markers have been useful in distinguishing species, sections, and populations. They provide evidence for hybridity between species and can confirm heredity within crosses. As more sophisticated methods of genomic evaluation are cooperatively developed for use in pecan breeding and selection, the use of these methods will be supplemented and informed by the lessons provided by microsatellite markers, as interpreted across broad germplasm collections. In this study, over 400 Carya accessions from diverse diploid and tetraploid taxa and their interspecific hybrids, maintained at the USDA National Collection of Genetic Resources for Carya (NCGR-Carya), were analyzed using 14 nuclear and 3 plastid microsatellite markers. Principal coordinate analysis showed clear taxonomic classifications at multiple taxonomic levels along with patterns of interspecific hybridity. Evidence was also found for genetic differences associated with geographic distribution. The results indicate that this group of markers is useful in examining and characterizing populations and hybrids in the genus Carya and may help delineate the composition of a core collection to help characterize the NCGR-Carya repository collection for use in its pecan breeding program. The SSR fingerprints of the inventories of the USDA NCGR-Carya repository can also be used as a reference for identifying unknown pecan trees for growers.

All members of section Apocarya are diploid (2n = 2x = 32) and frequently hybridize with pecans, providing an avenue for introgression of genes between sympatric populations distributed across the southeastern United States into Mexico. The pecan hickories consist of four species mainly distributed in the South-Central states along the Mississippi River (Figure 1A). International horticultural attention has traditionally focused on pecan (Carya illinoinensis), the most widely grown species. Native pecans belong to section Apocarya and are primarily distributed in the well-drained soils of the Mississippi River and its tributaries, from northern Illinois and southeastern Iowa south to the Gulf Coast of Louisiana and west to the Edwards Plateau (Figure 1A) [6]. Section Carya, i.e., the North American or typical hickories, includes eight species that are either diploid or tetraploid. In addition to these recognized Carya species, there are 11 interspecific hybrids among the Carya species [6]. Hybridity in sympatric species of Carya sharing the same ploidy level is evident in the abundance of known hybrids and has been examined using molecular markers [4]. Species in the genus Carya are classified by their taxonomic, botanical, and horticultural characteristics, along with their ploidy level.
A wide range of genetic diversity within crops greatly benefits plant breeding, especially the modern molecular breeding program. USDA NCGR-Carya maintains a diverse collection of Carya species, from sources worldwide, emphasizing native collections and cultivars of pecan [7]. However, owing to their large tree size and long lifespans, it can be challenging to maximize the genetic diversity within such a large collection effectively.
The development and application of molecular markers have greatly aided in collecting and maintaining the repository germplasm [8,9]. SSRs (simple sequence repeats), or microsatellite markers, have proven to be a valuable tool in the genetic analysis of relatedness between species or within different populations in plants and animals [10–15]. In the past two decades, SSRs have been widely used because they fulfill most of the desirable characteristics of a molecular marker, such as high polymorphism, high frequency, an even distribution in the genome, co-dominant inheritance, high reproducibility, and suitability for automation [16]. However, the classic DNA-based marker system has limitations for population genotyping, genetic/genomic mapping, and parentage/hybrid identification [7,17–20]. In addition, the reproducibility of DNA-based marker genotyping is a challenge [21–24]. Next-generation sequencing technology has dramatically decreased the costs of developing SSR markers and can provide a large set of DNA-based markers for plant breeding and population genotyping [10,25–27]. Nevertheless, SSR markers are still valuable, especially for plant breeders who use DNA-based markers to routinely identify a few unknown plants, parentage, or hybrids in small to medium-sized labs [18,20,28].
In pecan, SSR markers have not been widely used to study genetic diversity and population structure or to identify parentage and hybrids because of limited genetic resources. NCGR-Carya evaluated 24 SSR primer pairs from a microsatellite-enriched library, and 19 SSR markers successfully produced amplification products in a group of 48 pecan and hickory accessions [8]. This team then evaluated 8 plastid markers [16,29] and found that 3 were polymorphic and informative in a set of 169 Carya accessions [30]. In China, 8 SSR markers were also used to identify 77 pecan accessions, including domestic, introduced, and unknown pecan trees [31]. So far, NCGR-Carya has chosen a panel of 17 polymorphic SSR markers to screen the USDA pecan repository collections [4,9,30,32]. Of them, 14 were nuclear in origin, with 9 originating in pecan (Carya illinoinensis (Wangenh) K. Koch) [8], 1 from a pecan EST sequence (GenBank accession number CV973667) [4], and 4 from a walnut (Juglans nigra) library. These 17 SSR markers have been used to survey the nuclear and plastid genetic variation in representative samples of 80 indigenous pecan trees collected from throughout the native range of Carya illinoinensis [4,32]. This study evaluated these 14 nuclear and 3 plastid SSR markers for their utility in distinguishing a multi-species panel of 410 accessions in genus Carya, including interspecific hybrids. Seven accessions in section Rhysocaryon of Juglans were selected as an outgroup for principal coordinate analysis (PCoA). The USDA Pecan Breeding program has a long-term goal of maintaining a large set of repository collections and using them efficiently for new cultivar development. Hence, the species-specific and geographical differentiation identified by these SSR markers may help delineate the composition of a core collection to help accomplish that goal.

Materials
A total of 410 Carya trees were included in this research (Table 1 and Supplementary Table S1), representing 275 accessions of section Apocarya (abbreviation apo), 108 accessions of section Carya (car), 10 accessions of section Sinocarya (sin) (C. tonkinensis in section Sinocarya had only one entry in Table 1 and was excluded from analysis), and 17 interspecific hybrids (hyb). Of the 275 accessions in apo, 251 pecan accessions (Carya illinoinensis) were analyzed for geographical diversity, representing grafted cultivars, native seedling accessions, and selected controlled cross progeny families. The population sizes and collection sites of each species are listed in Table 1 (see Supplementary Table S1 for details). As representatives of Juglans, a neighboring genus in Juglandaceae, a set of seven accessions in section Rhysocaryon, including two J. major accessions (maj), three J. macrocarpa accessions (mac), and two J. nigra accessions (nig), were also analyzed as an outgroup for clustering and principal coordinate analysis.
[…] tissue samples associated with the herbarium vouchers he collected from Carya across the United States and Mexico, which are maintained in the Duke University herbarium.

Georeferencing
Each accession's decimal latitude and longitude coordinates were determined based on GPS coordinates collected with the tissue or the best location estimate based on passport information. Regional designations for pecan were based on the overlap between three zones of latitude and three of longitude, as shown in Figure 1 [4].

DNA Extraction
Immature leaflets were harvested from verified inventories, immediately rinsed with water, and placed in coded sample tubes on ice for transport to the laboratory. Genomic DNA was extracted from fresh leaves, silica gel-dried leaves, buds, or wood using methods modified from previous reports [1–3]. Extractions were performed using the DNeasy Plant Mini Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. DNA quality was checked on a 1% agarose gel. DNA concentration was quantified using a NanoDrop 1000 spectrophotometer (ThermoFisher Scientific, Waltham, MA, USA).

PCR Amplification
Seventeen SSR primer pairs (see Table 1 in Grauke et al. 2015), including 9 from the pecan microsatellite library [8], 1 from the pecan EST library (GenBank accession CV973667), 4 from the walnut GA-enriched library [33], and 3 plastid markers [16,29], were used for PCR amplification. For each primer pair, the forward primer was fluorescently labeled at the 5′-end with either 6-FAM (blue color) or HEX (green color). PCR reactions were performed in 10 µL volume consisting of 3 to 5 ng of genomic DNA, 20 mM Tris-HCl (pH 8.

Fragment Analysis
PCR products were loaded on the ABI Prism Genetic Analyzer 3130 (Applied Biosystems, Foster City, CA, USA) after mixing 0.5 µL of PCR product with 5 µL of 2.5% 400-ROX internal size standard in deionized formamide; sample preparation followed the details in [4,8]. Relative allele sizes were called automatically and adjusted manually [3,8,30,32] using GeneScan and Genotyper software v 3.7 (Applied Biosystems). Alleles were recorded as whole numbers in bp after binning with Flexibin V2 [34]. Allele profiles in bp were converted to binary using an Excel macro [35] and evaluated using GenAlex 6.5 [36].
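The conversion of binned allele calls to binary profiles can be illustrated with a minimal Python sketch. This is not the Excel macro cited above; the function name, the input layout, and the accession and locus names are hypothetical and serve only to show the idea that each observed (locus, allele size) pair becomes one presence/absence column.

```python
def alleles_to_binary(calls):
    """Convert allele calls (bp) to a presence/absence matrix.

    calls: dict mapping sample -> dict mapping locus -> list of allele sizes in bp.
    Returns (columns, matrix): columns is a sorted list of (locus, allele) pairs,
    and matrix maps each sample to a list of 0/1 values in column order.
    """
    # Collect every (locus, allele) pair observed in any sample.
    observed = set()
    for loci in calls.values():
        for locus, alleles in loci.items():
            for allele in alleles:
                observed.add((locus, allele))
    columns = sorted(observed)

    # Score each sample: 1 if the band is present, 0 if absent.
    matrix = {}
    for sample, loci in calls.items():
        present = {(locus, a) for locus, alleles in loci.items() for a in alleles}
        matrix[sample] = [1 if col in present else 0 for col in columns]
    return columns, matrix


# Hypothetical allele calls for two accessions at two loci.
calls = {
    "acc1": {"locusA": [120, 124], "locusB": [200]},
    "acc2": {"locusA": [124], "locusB": [200, 206]},
}
columns, matrix = alleles_to_binary(calls)
for sample, row in matrix.items():
    print(sample, row)
```

Scoring bands this way lets diploid and tetraploid accessions share one data matrix, at the cost of discarding allele dosage.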

Population Structure Analysis
The variation among and within the inferred populations and the population structure were assessed using GenAlex 6.5 [36], SAS 9.3 [37], and STRUCTURE 2.3.4 [38]. Molecular profiles were converted to binary form and associated with taxonomic and geographic classification descriptors. To investigate the genetic relationships among accessions in the Carya sections, the binary data were subjected to principal coordinate analysis (PCoA) in SAS 9.3 [37] and to model-based clustering in STRUCTURE 2.3.4 [38]. The Shannon diversity index (I) and heterozygosity (He) of each species population were calculated using GenAlex 6.5 [36]. In STRUCTURE, an admixture model with uncorrelated allele frequencies was used to test the number of subpopulations (K = 2-6). Each K was run five times with a burn-in period of 100,000 steps followed by 100,000 Markov chain Monte Carlo (MCMC) replicates. The log probability of data (LnP(D)) was estimated for each run, and an ad hoc statistic, ∆K, based on the rate of change in LnP(D) between successive K values, was used to determine the most likely number of subpopulations [39].
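As an illustration of how ∆K can be derived from replicate STRUCTURE runs, the following Python sketch uses a commonly applied form of the statistic: the absolute second-order difference of the mean LnP(D) divided by the standard deviation of LnP(D) at that K. The LnP(D) values are invented for demonstration and are not results from this study; the function name is hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute Delta K for each interior K.

    lnp_by_k: dict mapping K -> list of LnP(D) values from replicate runs.
    Delta K(K) = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), with L the mean LnP(D).
    K values without both neighbours (the smallest and largest K) are skipped.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second-order difference undefined at the ends
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # Delta K undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Invented LnP(D) values, three replicate runs per K (illustration only).
lnp = {
    2: [-1000, -1002, -998],
    3: [-900, -901, -899],
    4: [-890, -892, -888],
    5: [-885, -887, -883],
    6: [-884, -880, -888],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

Because the statistic needs both neighbours, testing K = 2-6 yields ∆K only for K = 3, 4, and 5.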

Genetic Differentiation in the Three Sections of Genus Carya
Principal coordinate analysis (PCoA) was performed to determine the genetic relationships among the three sections of genus Carya and section Rhysocaryon in Juglans. The 3 plastid and 14 nuclear SSR markers could clearly distinguish between the sections Apocarya, Carya, and Sinocarya of the genus Carya, and interspecific hybrids between species of different sections, as well as section Rhysocaryon of Juglans (Figure 2). For the sections as a whole (i.e., all accessions within sections combined), the PCoA plot separated the three sections of Carya, interspecific hybrids, and section Rhysocaryon in Juglans, with the first two coordinates explaining 44.1% of the total variation (24.9% and 19.2%, respectively). When species in Juglans were included, the PCoA plots indicated that the species in sections Apocarya and Carya and their hybrids were grouped into one cluster and were distinct from the two species in Sinocarya and three species in Juglans, which were grouped into two distinct clusters (Figure 2A). C. tomentosa is a tetraploid and distinct from the other three tetraploids (tex, gla, and flo) in section Carya. These results indicate that the 3 plastid and 14 nuclear SSR markers provide valuable insight into the genetic structure of the genus Carya, confirming taxonomic distinctions between and within sections and close association with section Rhysocaryon of Juglans.
As shown in the PCoA, section Sinocarya was distinct from the other two sections in genus Carya. Accessions grouped by section, with a few Apocarya and Carya accessions overlapping (Figure 2A and Supplementary Figure S1A). For the accessions in section Carya, diploid and tetraploid species formed a cluster that included Apocarya hybrids. Section Sinocarya and outgroup Juglans were distinct from sections Apocarya and Carya in the PCoA results (Figure 2A). Species in sections Apocarya and Carya can be separated distinctly, with their interspecific hybrids scattered between the sections (Figure 2B). When all 417 accessions (275 apo, 108 car, 10 sin, 17 hyb, and 7 rhy entries) were included individually in the PCoA plot, the variation explained by the first two coordinates decreased to 16.1% (12.3% and 3.8%, respectively) (Supplementary Figure S1), indicating complex variation of the accessions among the three sections in genus Carya. The PCoA plot generally agrees with the Q plot, in which membership coefficients were estimated for each accession in a cluster (Figure 2C and Supplementary Figure S1B).
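For readers unfamiliar with how the percentages of variation reported for each coordinate arise, the sketch below shows a classical PCoA (Gower double-centering followed by eigendecomposition) in Python with NumPy. It is an illustration under stated assumptions, not the SAS procedure used in this study: the distance measure (a simple count of mismatched bands), the toy binary profiles, and the function name are all hypothetical.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical principal coordinate analysis of a square distance matrix.

    Returns (coords, explained): coordinates on the first n_axes axes and the
    percentage of variation each axis explains, relative to the sum of the
    positive eigenvalues.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared distances.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # Eigendecomposition of the symmetric matrix, sorted largest first.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean distances can give negative eigenvalues; ignore them here.
    positive_sum = eigvals[eigvals > 1e-12].sum()
    explained = 100.0 * eigvals[:n_axes] / positive_sum
    coords = eigvecs[:, :n_axes] * np.sqrt(np.clip(eigvals[:n_axes], 0, None))
    return coords, explained


# Toy binary band profiles for four hypothetical accessions.
profiles = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
])
# Assumed distance: number of bands at which two profiles differ.
dist = (profiles[:, None, :] != profiles[None, :, :]).sum(axis=2)
coords, explained = pcoa(dist)
print(coords)
print(explained)
```

With many accessions and bands, the variation spreads over more axes, which is consistent with the lower percentages seen when all 417 accessions are plotted individually rather than as section-level groups.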

Relationships among Species within Carya Sections
As shown in Figure 2, species in each section in genus Carya formed one cluster, indicating a close genetic relationship within each section. The three sections are distinct from each other, indicating extensive genetic diversity among sections (Figure 3). Hybrids between sections Apocarya and Carya fall between these two sections. For example, four species (aqu, cor, ill, and plm) in section Apocarya are grouped into one cluster, and two hybrids (xbr and xlc) are close to this cluster; six species (lac, ovt, flo, gla, tex, and tom) within section Carya are grouped in one cluster, and their hybrids (xio and xnu) are close to this cluster. The first two coordinates of this PCoA plot explained 44.7% of the total variation (29.9% and 14.7%, respectively). The Asian Carya are the most threatened in the genus and have unique reproductive mechanisms [4]. The band frequency (f), allele frequency (p and q), effective allele number (Ne), Shannon diversity index (I), and heterozygosity (He) were calculated within the species populations (Table 2). The Asian Carya group (cat and dab) showed the lowest diversity, as indicated by a lower Shannon diversity index (I) and the lowest heterozygosity compared with species in the other two sections in genus Carya. This evaluation indicated that two species (cat and dab; ton has only one sample and was not included for analysis) were isolated from the other two sections and were distinct from each other, although they are increasingly mixed during commercial marketing.
Two species, cat and tom, and one hybrid, xla, had lower diversity (I) (Table 2). This may be caused by the small population size (Table 1). Although two subsets of C. ovata populations from the United States (n = 15, ovt-US) and Mexico (n = 7, ovt-MX) were grouped within section Carya, they are distinct from each other (Figure 3), indicating some level of geographic differentiation. C. myristiciformis is currently classified in section Carya and has been recognized as a morphological intermediate between the sections [4]. With these 17 molecular markers, two subsets of C. myristiciformis populations from the United States (n = 5, myr-US) and Mexico (n = 6, myr-MX) were grouped with four other species in section Apocarya, but are distinct from each other (Figure 3).
Hybrids between species within section Apocarya, such as xbr (ill × cor) and xlc (ill × aqu), were clustered with their parents in the same group. Interestingly, diploid and tetraploid species of section Carya form a cluster that includes Apocarya hybrids, such as xla (cor × ovt), xio (ovt × ill), and xnu (ill × lac) (Figure 3), meaning that the cluster is defined not by ploidy but by other shared characteristics. Gene flow between the sections justifies the inclusion of selected species representatives in core diversity panels and increased attention to their horticultural value in rootstock or scion development.
Table 2 note: Na = no. of different alleles; Ne = no. of effective alleles = $1/(p^2 + q^2)$; I = Shannon diversity index = $-(p \ln p + q \ln q)$; He = expected heterozygosity = $2pq$; uHe = unbiased expected heterozygosity = $\frac{2N}{2N-1} \times$ He, where for diploid binary data and assuming Hardy-Weinberg equilibrium, $q = (1 - \text{band freq.})^{0.5}$ and $p = 1 - q$. Species 'xom' and 'ton' have only 1-2 collections and are excluded from analysis. The sample size in the populations tom and xla is less than 5, and the results should be treated with caution.
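The formulas in the table note can be worked through for a single band with a short Python sketch. It follows the definitions given in the note (diploid binary data under Hardy-Weinberg equilibrium); the function name and the example band frequency and sample size are hypothetical, and population-level values would be obtained by averaging such per-band values across loci.

```python
import math


def band_diversity(band_freq, n):
    """Per-band diversity statistics from a band frequency.

    band_freq: fraction of individuals showing the band (0 to 1).
    n: number of individuals sampled (N in the table note).
    """
    q = math.sqrt(1.0 - band_freq)   # q = (1 - band freq.)^0.5
    p = 1.0 - q                      # p = 1 - q
    ne = 1.0 / (p * p + q * q)       # effective number of alleles
    # Shannon index; terms with zero frequency contribute nothing.
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    he = 2.0 * p * q                 # expected heterozygosity
    uhe = (2.0 * n / (2.0 * n - 1.0)) * he  # unbiased expected heterozygosity
    return {"p": p, "q": q, "Ne": ne, "I": shannon, "He": he, "uHe": uhe}


# Hypothetical band present in 75% of 10 sampled individuals.
stats = band_diversity(0.75, 10)
for name, value in stats.items():
    print(f"{name} = {value:.4f}")
```

The small-sample correction in uHe grows as N shrinks, which is one reason the note advises caution for populations with fewer than five samples.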

Genetic Differentiation of Carya Species Based on Geographic Origins
Although the population size does not determine the genetic diversity and heterozygosity within the group (Table 2), accessions of a species from nearby geographical locations still showed some diversity. When the accessions in C. illinoinensis were analyzed by geographic origin, they were separated distinctly (Figure 4). Collections in the north-central (nc) and northeast (ne) regions of the United States were distinct from other collections. The central (cc) and central east (ce) regions were grouped, while south-central (sc), southwest (sw), and central west (cw) were in another cluster, and the mix entries were grouped into the cluster with cc and ce (Figure 4).
Accessions in each section in genus Carya adapted to their local climates and, therefore, differentiated genetically [4,7,32]. All accessions in this study can be divided into seven populations based on their latitudes and six populations based on their longitudes (Figure 5). When populations were plotted by longitude, the first and second coordinates explained 52.3% and 39.7% of the variance, respectively (Figure 5A). When populations were plotted by latitude, the first and second coordinates explained 55.3% and 21.0% of the variance, respectively (Figure 5B). In both plots, some accessions overlapped across geographical locations, indicating gene flow across latitude and longitude. The genetic diversity (I) and heterozygosity (He) in a group are similar within a range by longitude and latitude (Table 3), indicating genetic differences associated with geographic distribution.
Patterns of heterozygosity observed within species populations (Table 2) and regional populations (Table 3) have indicated genetic diversity within species and geographical regions. The analysis of the pairwise population matrix of 22 regional populations in C. illinoinensis showed patterns of homozygosity or affinity (Table 4). Population MX4 had the greatest genetic distance from 16 of the 22 populations, the exceptions being DC, MX1, MX2, MX3, MX4 itself, and NLC. Populations MX1, MX2, MX3, and MX4, which were from Mexico, had the greatest genetic distance from the TX5 population, with the largest value, 30.45 (average population binary genetic distance), between MX4 and TX5 (Table 4). DC had the greatest genetic distance from IL, and NLC had the greatest genetic distance from DC. Population MX3 had the lowest genetic distance from itself, implying a level of inbreeding that contributed to its selection as the draft template genome sequence [40]. It might also have utility in breeding strategies to develop uniform seedling populations by crossing with selections from other identified inbred populations.

Table 3. Sample size (N), band frequency (f), estimated allele frequency (p and q), no. alleles (Na), no. effective alleles (Ne), Shannon diversity index (I), expected (He), and unbiased expected heterozygosity (uHe) for the 251 collections in C. illinoinensis from the United States and Mexico.
Note: Na = no. of different alleles; Ne = no. of effective alleles = $1/(p^2 + q^2)$; I = Shannon diversity index = $-(p \ln p + q \ln q)$; He = expected heterozygosity = $2pq$; uHe = unbiased expected heterozygosity = $\frac{2N}{2N-1} \times$ He, where for diploid binary data and assuming Hardy-Weinberg equilibrium, $q = (1 - \text{band freq.})^{0.5}$ and $p = 1 - q$. Abbreviations of the populations: cc = central, ce = central east, cw = central west, nc = north central, ne = northeast, sc = south central, sw = southwest, and mix = mixed. The accessions of each population can be found in column I (Ecozone) of Supplementary Table S1.
Table 4. Pairwise population unbiased Nei genetic identity (diagonal) and unbiased Nei genetic distance (above or below the diagonal). Populations (rows and columns): ALM, DC, IL, KS, KY, mix, MO, MX1, MX2, MX3, MX4, MX5, NLC, Se, TN, TX1, TX2, TX3, TX4, TX5, TX6, TXSD.
Note: Pecan populations were labeled as sampled within the state of origin (Supplementary Table S1). Populations with the highest affinity between populations are shown in green and those with the greatest distance in orange. MX3 showed the lowest affinity and is highlighted in yellow.

Discussion
Our previous plastid and nuclear SSR profiles displayed different patterns of genetic variation, especially for those populations derived from open-pollinated seeds of diverse geographical origins [32]. The panel in this study contains a set of accessions comprising the three sections of the genus Carya and section Rhysocaryon of the genus Juglans (Figure 1). Based on the consistent allele calls (present = 1 and absent = 0), we used both plastid and nuclear SSRs to investigate the population structure. These markers provided more power to detect population structure and genetic diversity than previous studies that used only three plastid markers [4,30]. The genetic structure of the genus Carya confirms taxonomic distinctions between and within sections and a close association with section Rhysocaryon of Juglans. The distinct diversity within the hybrid populations among the sections in the genus Carya suggests possible gene flow within and/or between sections, indicating potential utility in pecan cultivar breeding for trait improvement. The extensive genetic diversity among species in the genus Carya revealed by the SSR markers, including plastid markers, is also indicated by the C. illinoinensis chloroplast genome sequences [41]. In addition, plastid SSR markers developed for Carya are informative in Juglans, and nuclear microsatellites developed for Juglans are used for Carya, indicating that some markers may be transferable between the genera Juglans and Carya [4,33]. The 17 interspecific hybrids between section Apocarya and section Carya harbored genetic traits from their parents and presented wide genetic diversity, and they can serve as bridges for trait improvement in the USDA pecan breeding program.
Patterns of homozygosity observed within some regional populations in the SSR profile for 251 accessions in C. illinoinensis (Table 4) may have future utility in breeding strategies to develop uniform seedling populations. One accession from such a population, 87MX3-2.11, has had its whole genome sequenced and has been used as a reference to develop genetic tools for accelerating pecan breeding [41]. In addition, existing profiles from verified inventories will be useful in establishing additional methods of molecular verification in other identified inbred populations. Trees in the MX4 population are self-rooted seedlings grown from seeds collected from a putative native stand in Ixmiquilpan, Hidalgo, Mexico, in 1987. Like most Mexican populations (MX1-5), they break bud early in the spring, but are the last to cease growth in the fall, often continuing active growth into December. As a group, they manifest indeterminate growth, resulting in a weeping habit.
Some species, such as C. dabieshanensis and C. cathayensis, found only in Asia and isolated from each other (Figure 1D), are the most threatened in the genus. All five specimens of C. dabieshanensis are distinguished from those of C. cathayensis by the 17 SSR markers in this study, in agreement with the study by Grauke and Mendoza-Herrera [4], which used only three plastid markers. However, although C. dabieshanensis is distinct from C. cathayensis (Figure 2), specimens of the Asian species (section Sinocarya) are increasingly mixed during marketing. Efforts should be made to collect and characterize extant native populations and other Asian species not currently available. Given the potential for the hybridization seen in sympatric Carya populations in the United States and Mexico [9], the diversity of native populations of species in section Sinocarya should be characterized to investigate potentially valuable reproductive strategies that have not previously been seen in other Carya species [4].
Although the 17 markers separate the Carya species distinctly, the placement of some clusters is still disputed (Figure 3). For example, C. myristiciformis is currently classified in section Carya and has been recognized as a morphological intermediate between the sections. However, our results showed that this species groups with Apocarya. Specimens of C. myristiciformis exhibit disjunct distributions between the southeastern United States (5 myr-US) and Mexico (5 myr-MX) (Figure 3), which are distinct from each other and show hybridity with pecan [4]. Similarly, 22 accessions of C. ovata separate distinctly into two populations, from the southeastern United States (15 ovt-US) and Mexico (7 ovt-MX) (Figure 3), and show hybridity with pecan. The limited accessions of both C. myristiciformis and C. ovata from two geographical locations show higher diversity levels (I = 0.32-0.49), suggesting that gene exchange occurs naturally in the absence of human activity. The role of C. myristiciformis in the phylogenetic development of the genus may be explored with appropriate inclusion in core diversity panels in future experiments. This species is threatened across its disjunct range from the southeastern United States into Mexico and is included in efforts at in situ conservation.
Gene exchange under natural selection among geographical locations results in local adaptation of plant species, even within the same genus [42]. Such genetic variation provides a foundation for species to respond to changes in the natural environment [43]. Previous studies indicated higher levels of genetic variation in northern and central C. illinoinensis populations and lower levels of genetic variation in southern and eastern populations based on nuclear loci [19,32]. In pecan, geographic populations still harbor distinct genetic traits that should be identified and conserved in ex situ collections, some of which are due to maternally inherited plastids [32]. The SSR profiles here provide important insights into the geographical distribution of genetic variation in collections of the genus Carya and the species within its sections. Our analyses of 394 Carya accessions with 17 SSR markers expanded previous contributions by identifying differences in the geographic structure of nuclear and organellar genetic variation [32]. Our SSR data showed a distinctive geographic profile of genetic diversity among the eight ecozone populations (Figure 4). However, there was a low percentage of expected heterozygosity (~5%, data not shown) within each ecozone population. The ecozone populations 104E-119E (pop 6 in Figure 5A) and 16N-20N (pop 1 in Figure 5B) are more distant than the other ecozone populations in their respective figures, suggesting that the largest amount of genetic variation exists in distant geographic locations.
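The diversity statistics cited in this discussion can be illustrated with a minimal sketch. It assumes that I denotes Shannon's information index, I = -Σ p_i ln(p_i), and that expected heterozygosity is computed as He = 1 - Σ p_i², where p_i is the frequency of allele i at a locus. The function names and the allele sizes below are hypothetical and are not taken from the study; the software and any sample-size corrections actually used may differ.

```python
from collections import Counter
from math import log


def allele_frequencies(alleles):
    """Return {allele: frequency} for a list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2), with no small-sample correction (an assumption)."""
    freqs = allele_frequencies(alleles)
    return 1.0 - sum(p * p for p in freqs.values())


def shannon_index(alleles):
    """I = -sum(p_i * ln p_i), using the natural logarithm."""
    freqs = allele_frequencies(alleles)
    return -sum(p * log(p) for p in freqs.values())


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) at one SSR locus: four diploid
    # individuals, two allele calls each. Not data from the study.
    locus = [152, 152, 156, 156, 152, 160, 156, 152]
    print("He =", expected_heterozygosity(locus))
    print("I  =", shannon_index(locus))
```

Per-locus values would normally be averaged across loci within each population before populations are compared.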
Local climate variation was reflected in phenotypic patterns of leaf morphology and disease resistance. Regional populations should be represented in core collections for both genomic screening and refinement of phenotypic descriptors. The diversity observed in the genus Carya showed that the plastid markers were informative in separating geographically diverse Carya populations. New genomic tools that include more plastid markers should be developed because the maternally inherited plastid markers offer improved resolution between geographic regions, relative to the nuclear markers tested [30]. This can be done with the next-generation sequencing technologies developed in other plant species [41]. In pecan breeding programs, accessible markers for the confirmation of cultivar identity are needed by the nursery industry, even if their usage may be limited. The USDA-ARS Pecan Breeding program has been using these 14 nuclear and 3 plastid SSR markers to identify cultivars, hybrids, and unknown varieties. Recently, three of these SSR markers were used to identify F1 progeny in a cross of "Lakota" × 87MX3-2.11, which was used for candidate gene identification [44]. Existing profiles from verified inventories will be useful in establishing additional methods of molecular verification [18]. Additional microsatellite markers should be selected for their distribution across all chromosomes and for longer repeat motifs (tri- and tetranucleotide). Next-generation sequencing technology provides a more efficient tool to quickly profile tree species with a greater resolution of interspecific hybridity, especially when the resulting markers (SNPs) are aligned to an available chromosome-scale genome sequence [27,44]. However, SSR markers still have great utility for plant breeding programs and researchers who need markers that have high reproducibility, transferability between species, and platform independence.
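The use of nuclear SSR markers to confirm F1 progeny rests on a simple rule: at each locus, a diploid offspring must carry one allele from each parent. The following minimal sketch illustrates that rule only; it is not the procedure used by the breeding program. The locus names, allele sizes, and function names are hypothetical, and the check ignores null alleles, mutation, and scoring error, all of which matter in practice.

```python
def compatible_at_locus(offspring, mother, father):
    """True if the offspring genotype can be formed from one maternal
    and one paternal allele at this locus."""
    a, b = offspring
    return (a in mother and b in father) or (b in mother and a in father)


def incompatible_loci(offspring, mother, father):
    """Return the loci at which the offspring conflicts with the stated parents."""
    return [
        locus
        for locus in offspring
        if not compatible_at_locus(offspring[locus], mother[locus], father[locus])
    ]


if __name__ == "__main__":
    # Hypothetical genotypes (allele sizes in bp) at three nuclear SSR loci.
    mother = {"locus1": (150, 154), "locus2": (200, 200), "locus3": (310, 316)}
    father = {"locus1": (152, 158), "locus2": (204, 208), "locus3": (310, 312)}

    candidate_f1 = {"locus1": (154, 152), "locus2": (200, 208), "locus3": (310, 310)}
    candidate_self = {"locus1": (150, 154), "locus2": (200, 200), "locus3": (316, 316)}

    print(incompatible_loci(candidate_f1, mother, father))
    print(incompatible_loci(candidate_self, mother, father))
```

A seedling with no incompatible loci is consistent with the cross but not proven by it; the power to exclude selfs and off-types depends on how many loci separate the two parents.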

Conclusions
Accessions of the Carya species distributed between the eastern United States and northeastern Mexico are genetically diverse. Microsatellite (SSR) markers provide enough resolution to distinguish this diversity in the genus Carya. Evidence from both nuclear and maternally inherited molecular markers indicates that the diverse Carya species in the United States and Mexico hybridize frequently among sympatric trees, which adapt to the constraints of the local environment. The hybrids bring great opportunity for new pecan cultivar development. Microsatellite profiles of the two C. myristiciformis and two C. ovata populations from the southeastern United States and Mexico are distinct from each other, and each indicates hybridity with pecan, offering increased potential for the development of pecan cultivars [4]. Although existing molecular markers provide insights into the dynamic diversity of the repository germplasm, a strategy for long-term maintenance of this diverse germplasm should include in situ preservation. Understanding the functional diversity in wild relatives of pecan will require more efficient and powerful tools. Next-generation sequencing technology provides a solution, especially for large, long-lived trees like pecan. Our four pan-genomic sequences of Carya illinoinensis will be the foundation for developing molecular markers for use in marker-assisted breeding to speed up the pecan breeding process [44].