Development of Chloroplast Microsatellite Markers and Evaluation of Genetic Diversity and Population Structure of Cutleaf Groundcherry (Physalis angulata L.) in China

Cutleaf groundcherry (Physalis angulata L.), an annual plant containing a variety of active ingredients, has great medicinal value. However, studies on the genetic diversity and population structure of P. angulata are limited. In this study, we developed chloroplast microsatellite (cpSSR) markers and applied them to evaluate the genetic diversity and population structure of P. angulata. A total of 57 cpSSRs were identified from the chloroplast genome of P. angulata. Among all cpSSR loci, mononucleotide markers were the most abundant (68.42%), followed by tetranucleotide (12.28%), dinucleotide (10.53%), and trinucleotide (8.77%) markers. In total, 30 newly developed cpSSR markers with rich polymorphism and good stability were selected for further genetic diversity and population structure analyses. These cpSSRs amplified a total of 156 alleles, 132 (84.62%) of which were polymorphic. The average percentage of polymorphic alleles per primer pair and the average polymorphism information content (PIC) value of the cpSSRs were 81.29% and 0.830, respectively. Population genetic diversity analysis indicated that the average observed number of alleles (Na), number of effective alleles (Ne), Nei’s gene diversity (h), and Shannon information index (I) of 16 P. angulata populations were 1.3161, 1.1754, 0.1023, and 0.1538, respectively. Moreover, unweighted pair group method with arithmetic mean (UPGMA), neighbor-joining, principal coordinate, and STRUCTURE analyses indicated that 203 P. angulata individuals from 16 populations were grouped into four clusters. An analysis of molecular variance (AMOVA) revealed considerable genetic variation among populations, while the gene flow (Nm) value (0.2324) indicated a low level of gene flow among populations. Our study not only provided a batch of efficient genetic markers for research on P. angulata but also laid an important foundation for the protection and genetic breeding of P. angulata resources.


Introduction
Cutleaf groundcherry (Physalis angulata L.) is an important annual herbaceous plant from the Solanaceae family, mainly distributed in China, Japan, India, Australia, and the Americas [1,2]. P. angulata has high potential medicinal value and a very long history of being used in traditional medicines around the world. Recent phytochemical and pharmacological studies have confirmed that P. angulata is rich in vitamins, minerals, antioxidants, and many important pharmacologically active constituents, including antibacterial, anti-inflammatory, and anticancer ingredients [3][4][5][6]. In many countries, such as China, Indonesia, Peru, Mexico, and Brazil, P. angulata is often used to treat a variety of illnesses, including dermatitis, tracheitis, impaludism, rheumatism, hepatitis, and analogous conditions [7][8][9][10]. Recently, increasing attention has been paid to the phytochemical and pharmacological aspects of P. angulata, and a variety of bioactive steroids with antitumor activities, including physagulins (A−Q), physangulidines (A−C), withangulatins (A−I), physalins (B, D, F−H), and withaminimin, have been isolated from the species [4,5,11]. Due to their important medicinal value, P. angulata plants have been widely cultivated in some regions of China for decades.
Genetic research to determine genetic diversity and population dynamics can be invaluable when forming and revising a species management plan, as maintaining diversity is critical for conservation [12][13][14]. Information about genetic structure is essential to understanding the scales over which dispersal, genetic drift, and selection operate in populations [15]. The design of microsatellite (simple sequence repeat, SSR) markers is based on conserved nucleotide sequences on both sides of simple repeat sequences, and the polymorphism among alleles is reflected by detecting the difference in the number of repeats [16]. SSRs are widely distributed in the nuclear and organellar genomes of eukaryotes [17][18][19][20]. SSR markers are considered to be among the most ideal molecular markers for the study of plant genetics [14,18]. Chloroplast microsatellites (cpSSRs) are widely distributed in the chloroplast genomes of plants. CpSSR markers not only have the advantages of co-dominance, high polymorphism, and multi-allelic loci, but also have the characteristics of a slow evolution rate, low molecular weight, relative conservation, simple structure, and single-parent inheritance [21]. In recent years, cpSSR markers have been widely used in plant phylogeography [22], species identification [21], phylogeny [23], and genetic diversity and population structure studies [24].
To date, several molecular marker techniques, including the use of SSRs, ISSRs, InDels, and SNP markers, have been used to analyze the genetic diversity of some Physalis species, including P. peruviana, P. philadelphica, and P. floridana [25][26][27][28]. Until recently, however, no cpSSR markers have been developed within the genus Physalis, and there are no reports of cpSSR marker usage for P. angulata. Moreover, there have been no published molecular marker-based studies of genetic diversity in P. angulata populations with a large number of samples. Although P. angulata has many important applications, researchers and plant breeders have not given enough attention to its conservation and improvement. Consequently, the production, utilization, and improvement of P. angulata are severely restricted.
Hence, this study was initiated with the aim of developing cpSSR markers to evaluate the genetic diversity and population structure in P. angulata germplasm collections in China. The findings can eventually serve as an important basis for the genetic improvement and sustainable conservation of P. angulata resources.

Characterization of the Developed cpSSR Markers
In total, 57 cpSSR motifs (including three complex SSR types) were detected in the chloroplast genome of P. angulata, most of which were single-base repeats dominated by A and T (Tables 1 and 2). Information about the 57 SSR motifs is shown in Table S1. Among all detected SSR loci, mononucleotides were the most abundant, with 39 loci (68.42% of the total), followed by tetranucleotides with seven loci (12.28%), dinucleotides with six loci (10.53%), and trinucleotides with five loci (8.77%) (Table 1). Of the mononucleotide motifs, T was the most abundant (24 of 39, 61.54%), followed by A (14 of 39, 35.90%), and C was the least frequent (one of 39, 2.56%). The dinucleotide motifs were AT and TA, each accounting for 50%. The trinucleotide motifs were AAG, ACT, TAA, TTA, and TTC, each accounting for 20%. Of the tetranucleotide motifs, TTTA (28.57%) and TTTG (28.57%) were the most abundant, followed by AAAC (14.29%), CTAT (14.29%), and CTTA (14.29%) (Table 2 and Figure 1). The average lengths of the mononucleotide, dinucleotide, trinucleotide, and tetranucleotide cpSSRs were 12.04, 10.67, 11.6, and 12 bp, respectively (Table S1). Additionally, 39 SSR motifs (68.42%) had repeat numbers greater than or equal to 10 (Table 2 and Figure 1).
In this study, 54 cpSSR primer pairs were designed on the basis of the 57 cpSSR motifs detected in the P. angulata chloroplast genome using Primer Premier 5 with manual correction (Table S2). The amplified segments of these cpSSR primer pairs were distributed in four regions (LSC, SSC, IRA, and IRB) of the P. angulata chloroplast genome: 41 in the LSC region, five in the SSC region, four in the IRA region, and four in the IRB region. After screening the 54 cpSSR primer pairs with six P. angulata genomic DNA samples, a total of 30 cpSSR primer pairs (55.6%) with good stability, high polymorphism, and clear electrophoresis bands were chosen for further genetic diversity analyses of the P. angulata populations (Table 3).

CpSSR Analysis
A total of 156 alleles were amplified by the 30 cpSSR primer pairs, ranging from two (KzcpSSR19, KzcpSSR21, KzcpSSR25, KzcpSSR37, KzcpSSR38, KzcpSSR39, and KzcpSSR53) to 12 (KzcpSSR04) per primer pair, with an average of 5.2 alleles per primer pair (Table 3). The lengths of the amplified products ranged from 100 to 300 bp. Overall, 132 of the 156 amplified alleles (84.62%) were polymorphic, with an average of 4.4 polymorphic alleles per primer pair. The percentage of polymorphic alleles across the primer pairs ranged from 25.0% to 100.0%, with an average of 81.29%. The PIC value of each primer pair ranged from 0.550 to 0.945, with an average of 0.830 (PIC > 0.500). This indicates that these markers are highly informative and can be used to study the genetic diversity and genetic structure of P. angulata populations.

Genetic Diversity Analysis
The genetic diversity of the 16 P. angulata populations (n = 203) was evaluated, revealing high mean per population estimates of allele and genetic diversity (Na = 1.3161, Ne = 1.1754, h = 0.1023, and I = 0.1538; see Table 4). The PL of the 16 P. angulata

Genetic Differentiation and Gene Flow
Nei's genetic distance, calculated from pairwise comparisons, ranged from 0.0329 (between DQ and YW) to 0.8268 (between NH and NJX), with an average of 0.3401. Nei's genetic identity ranged from 0.4374 (between NH and NJX) to 0.9677 (between TZ and LH), with an average of 0.7386 (Table 5). A Mantel test conducted to assess the correlation between genetic distance and geographical distance among the populations of P. angulata revealed no significant correlation (r = −0.176, p = 0.055).
Table 5. Nei's genetic identity (above diagonal) and genetic distance (below diagonal).
At the species level, the total gene diversity (Ht) of P. angulata was 0.3224, the within-population gene diversity (Hs) was 0.1023, and the coefficient of gene differentiation (Gst) was 0.6827 (Table 6). The results indicated that 68.27% of the genetic variation occurred among P. angulata populations, and 31.73% occurred within populations. The Nm of P. angulata populations was 0.2324, indicating a low level of gene exchange among them. Furthermore, an AMOVA was performed to assess the molecular variation among P. angulata populations (Table 7). The AMOVA revealed that 29% of the total molecular variation was contributed by differences within populations, while the remaining 71% was due to differences among populations. Population differentiation (PhiPT = 0.706) was significant (p = 0.001), which was consistent with the results from the analysis of Nei's genetic diversity, indicating that genetic variation among populations exceeded that within populations.
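The relationships among Ht, Hs, Gst, and Nm reported above can be checked with a short calculation. The sketch below assumes the commonly used definitions Gst = (Ht − Hs)/Ht and Nm = 0.5(1 − Gst)/Gst, together with Nei's standard relation D = −ln(I) between genetic distance and identity. These formulas are not spelled out in the text, so they are stated here as assumptions, and the function names are illustrative only.

```python
import math


def gst(ht: float, hs: float) -> float:
    """Coefficient of gene differentiation, assuming Gst = (Ht - Hs) / Ht."""
    return (ht - hs) / ht


def gene_flow(gst_value: float) -> float:
    """Gene flow estimate, assuming Nm = 0.5 * (1 - Gst) / Gst."""
    return 0.5 * (1.0 - gst_value) / gst_value


def identity_from_distance(d: float) -> float:
    """Nei's genetic identity I, assuming the standard relation D = -ln(I)."""
    return math.exp(-d)


# Ht and Hs as reported in the text.
ht, hs = 0.3224, 0.1023
g = gst(ht, hs)
nm = gene_flow(g)
# Largest pairwise distance reported in the text (NH vs. NJX).
identity_nh_njx = identity_from_distance(0.8268)

print(f"Gst = {g:.4f}, Nm = {nm:.4f}, I(NH, NJX) = {identity_nh_njx:.4f}")
```

Under these assumptions the reported Ht and Hs give Gst = 0.6827 and Nm = 0.2324, and the largest reported distance (0.8268) corresponds to the smallest reported identity (0.4374), so the figures in this section are mutually consistent.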

Genetic Relationships
The UPGMA dendrogram of the P. angulata populations was constructed on the basis of Nei's genetic distance, which reflects the genetic relationships among populations. The UPGMA tree showed that the 16 populations could be divided into four major clusters (Clusters I, II, III, and IV) (Figure 2). Cluster I contained six populations, namely, HZ, LA, LH, TZ, DQ, and YW. Cluster II included four populations, namely, NJX, XJ, HG, and YN, while population JJ formed a separate cluster (Cluster III). The XS, PJ, NH, WZ, and NJZ populations were separated from the others and collectively grouped into Cluster IV.
To verify the results of the UPGMA analysis, an NJ tree of the 203 individuals was constructed on the basis of genetic distance (Figure 3). The NJ tree showed that the 203 P. angulata individuals were grouped into four distinct clusters, consistent with the results of the UPGMA analysis. Most individuals from the same population clustered together, but a few individuals from the same population grouped separately into different clusters. For example, almost all individuals in the XJ population were grouped into Cluster I, and the remaining individuals were grouped into Cluster II and Cluster III (Figure 3). Similar results were also observed in the JJ, YN, and NJX populations.
PCoAs based on the genetic distance matrix were then performed to visualize the genetic relationships among the 203 samples and the 16 populations of P. angulata (Figures 4 and 5). Among the 203 samples, the first two principal coordinate axes explained 35.04% and 14.55% of the molecular variance (Figure 4). Among the 16 populations, the first three principal coordinate axes explained 76.67%, 5.95%, and 3.26% of the total molecular variance (Figure 5). The results of the two-dimensional PCoA among the 203 samples and the three-dimensional PCoA among the 16 populations were consistent with the results of the UPGMA and NJ tree analyses (Figures 2 and 3). All the samples of P. angulata from the 16 populations were classified into four groups.
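To illustrate how a UPGMA dendrogram is built from a pairwise distance matrix, the following minimal sketch repeatedly merges the two closest clusters and averages distances over all member pairs. It is not the NTSYS-PC procedure used in this study. The labels P1 to P4 and their distances are invented purely for illustration and are not values from Table 5.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster `labels` by UPGMA.

    `dist` maps frozenset({a, b}) -> distance for every pair of labels.
    Returns the merges in order as (cluster_a, cluster_b, height), where
    height is half the average distance between the two merged clusters.
    """
    # Each cluster is a tuple of its member labels.
    clusters = [(lab,) for lab in labels]
    # Working distances between the current clusters.
    d = {frozenset({(a,), (b,)}): dist[frozenset({a, b})]
         for a, b in combinations(labels, 2)}
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merges.append((a, b, d[frozenset({a, b})] / 2.0))
        merged = a + b
        # Size-weighted average equals the mean over all member pairs.
        for c in clusters:
            if c in (a, b):
                continue
            d[frozenset({merged, c})] = (
                len(a) * d[frozenset({a, c})] + len(b) * d[frozenset({b, c})]
            ) / (len(a) + len(b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return merges


# Hypothetical toy distances among four made-up populations.
toy_labels = ["P1", "P2", "P3", "P4"]
toy_dist = {
    frozenset({"P1", "P2"}): 0.10, frozenset({"P1", "P3"}): 0.50,
    frozenset({"P1", "P4"}): 0.70, frozenset({"P2", "P3"}): 0.60,
    frozenset({"P2", "P4"}): 0.80, frozenset({"P3", "P4"}): 0.30,
}
merges = upgma(toy_labels, toy_dist)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

In this toy case P1 and P2 join first, P3 and P4 join next, and the two pairs are joined last at a height equal to half the mean of the four between-pair distances.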

Population Structure
Population structure analysis was employed to reconstruct the genetic relationships among the 16 P. angulata populations using the newly developed cpSSR markers. The results of the Structure Harvester analysis showed that the most likely value of K in the Bayesian clustering analysis was four (Figure 6A), indicating the presence of four main groups within the 16 P. angulata populations. The results of the Bayesian clustering analysis using STRUCTURE software (Figure 6B) confirmed the results obtained from the UPGMA dendrogram, NJ tree, and PCoA. The first cluster (yellow) contained the HZ, LA, DQ, YW, LH, and TZ populations. The second cluster (red) consisted of the XS, PJ, NH, WZ, and NJZ populations. The JJ population alone was placed into the third cluster (green), while the NJX, HG, XJ, and YN populations were placed into the fourth cluster (blue). Samples from the same population tended to cluster together, indicating that the genetic relationship between samples within a population was closer than that between populations. However, the results also showed that some individuals had mixed or overlapping cluster membership across populations, which suggests some gene exchange among the populations.

Discussion
SSR markers are popular tools in DNA marker technology and are extensively applied to the analysis of genetic diversity in plant populations. As a type of SSR marker, cpSSRs combine the high mutability of microsatellites with the high sequence conservation of the chloroplast genome. CpSSR markers are simple, efficient, and easy to operate, and they can reveal a high level of population diversity. CpSSRs have been used successfully to reveal genetic diversity among many plants [29][30][31][32]. However, to date, there have been no studies related to P. angulata SSR development, and no DNA marker technique has been employed to analyze the genetic diversity among P. angulata populations. In the present study, we developed a batch of cpSSR markers and used them to study the genetic diversity among P. angulata populations.
Mononucleotides (68.42% of the total) were the most common SSR loci detected in our study, followed by tetranucleotides (12.28% of the total). These proportions are similar to those of cpSSR marker types found in many plants [33][34][35]. Polymorphism is an important index for evaluating the application value of molecular markers in the study of plant genetic diversity [18,30,36]. The cpSSR markers developed in our study yielded reproducible polymorphic alleles in 203 samples from 16 P. angulata populations. This indicates that these cpSSR markers can be used as a reliable molecular tool for studying genetic diversity and population structure in P. angulata.
A total of 30 out of 54 cpSSR markers (55.56% of the newly developed markers) amplified electrophoretic bands with high levels of polymorphism, good stability, and high resolution in P. angulata populations. The polymorphic ratio of the cpSSR markers ranged from 25% to 100%, with an average of 81.29%, which was higher than the SSR-based polymorphic ratios reported for Dendrocalamus hamiltonii (73.5%) [37] and celery cultivars (53.8%) [38]. The PIC values ranged between 0.550 and 0.945, with an average of 0.830, indicating that the cpSSR markers had good polymorphism and could be used to assess genetic diversity in P. angulata populations.
Genetic diversity is the sum of genetic information in a population, which is necessary for population persistence, adaptation, and evolution [36,39]. The interaction of drift, migration, mutation, and selection is the main factor causing genetic diversity in natural populations [14,40]. Using cpSSR markers, our study indicated that there is considerable genetic diversity among P. angulata populations (Na = 1.3161, Ne = 1.1754, h = 0.1023, and I = 0.1538). The proportion of polymorphic loci among P. angulata populations ranged from 6.41% to 54.49%, indicating that there were significant differences in polymorphic loci among populations. Furthermore, the XS population showed greater genetic diversity than the other 15 populations. The populations ranked by genetic diversity as follows: XS > YN > WZ > XJ > PJ > HG > JJ > NJZ > YW > TZ > LA > NH > LH > NJX > HZ > DQ. Compared to other populations, the DQ population from Deqing, Zhejiang Province had the lowest level of genetic diversity, with the smallest number of polymorphic loci and the lowest Na and Ne values. The low genetic diversity of the DQ population is probably because it contains a limited number of individuals adapted to specific habitats due to their isolation from other populations. Therefore, to more accurately evaluate the genetic diversity of this population in the future, more samples will be required. The average h and I values among the P. angulata populations were 0.1023 and 0.1538, respectively (<0.5), similar to those of P. philadelphica and P. peruviana [27,28,41,42]. The scattered and narrow distribution ranges, as well as small population sizes and large spatial distances between populations, limit pollination between populations, leading to selfing and inbreeding. All of these factors may contribute to low genetic diversity.
Gene flow, negatively correlated with the genetic differentiation coefficient, is a basic microevolutionary phenomenon that often hinders genetic differentiation between populations and affects the maintenance of genetic diversity [14,43,44]. In our study, the level of gene flow among P. angulata populations was low (Nm = 0.2324), similar to that found in P. philadelphica, P. peruviana, and M. savatieri [28,41,45]. The low level of gene flow among P. angulata populations may be related to the genetic mode and the dispersal distance of the seeds and pollen. The Ht value of P. angulata (0.3224) was higher than that of P. philadelphica (Ht = 0.292) [41]. The Gst value of 0.6827 for P. angulata indicates that 68.27% of the genetic variation occurred among populations, while 31.73% of the genetic variation occurred within populations. This is a similar result to that observed in Rhodiola alsia [46]. The AMOVA results also supported population differentiation, indicating that most of the genetic variance occurred among populations rather than within populations.
The population genetic structure reveals the distribution pattern of genetic diversity within and among populations [14,39,47]. Three methods, UPGMA and NJ clustering, PCoA, and Structure analysis, were combined to detect the genetic diversity and population structure in P. angulata. Many studies have reported a correlation between genetic distance and geographical location in the populations of certain plants [30,39,45,47]. Similar results were also obtained in our study. The UPGMA and NJ results both indicated that the six populations from the Zhejiang region (HZ, LA, LH, TZ, DQ, and YW) were jointly grouped into one cluster. These six populations were geographically close, especially the LH and TZ populations, and the genetic distance between them was also the smallest. The populations that were geographically distant from Zhejiang, such as the NJX, XJ, HG, and YN populations, were clustered together, while the JJ population from Jiangxi was clustered separately. These results indicated that the genetic distance between the P. angulata populations from these provinces and those from Zhejiang Province was relatively large, and the level of gene flow was very low. Interestingly, some populations did not show a clear correlation with geographical location. For example, the NJZ population from Jiangsu Province and four populations (XS, PJ, NH, and WZ) from Zhejiang Province were clustered together, although they were geographically distant. In our previous investigation of morphological phenotypic traits, we found that the plant height in these five populations was higher than that in the other populations. This might explain why they clustered together. Our study showed that the genetic diversity of P. angulata was not only related to geographical location but also likely associated with the geographical environment, human selection, and other factors. 
Furthermore, the PCoA and Structure analysis results were identical, and both supported the results of the UPGMA and NJ tree analyses. In a future study, more P. angulata population samples and more genome-wide molecular markers (such as SNPs, nSSRs, AFLPs, and others) will be used to more comprehensively evaluate the genetic diversity and structure of P. angulata populations.

Plant Materials and DNA Extraction
In total, 203 samples of wild P. angulata representing 16 populations were randomly collected from their main areas of distribution in China (Table 8). Fresh, young leaf tissues from three individuals of each sample were randomly collected for genomic DNA isolation. Genomic DNA was isolated from these samples as described in our previous studies [1,48]. The DNA quality was evaluated using 1.0% agarose gel electrophoresis, and the DNA quantity was determined using a UV spectrophotometer.

CpSSR Marker Development
In our earlier work, the complete chloroplast genome sequence of P. angulata was sequenced, annotated, and submitted to the National Center for Biotechnology Information (NCBI) GenBank database (GenBank accession no. MH045574) [49]. The SSR loci distributed throughout the P. angulata chloroplast genome were screened using MISA software (http://pgrc.ipk-gatersleben.de/misa/; accessed on 20 May 2022) [50]. The cpSSR motifs searched comprised 1-6 nucleotides, each with a minimum number of repeats: ten for mononucleotide repeats, five for dinucleotide repeats, four for trinucleotide repeats, and three each for tetra-, penta-, and hexanucleotide repeats. Primer Premier 5 software was used to design cpSSR primers, and manual adjustments were made [51]. The parameters for designing the primers were set as follows: the primer length was 18-26 nucleotides, the annealing temperature was 55 °C ± 5 °C, and the amplification product size was 100-300 bp.
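The kind of SSR screening described above can be sketched in a few lines. This is not MISA itself: it is a simplified scan for perfect repeats only, it does not handle compound SSRs, and it reads the repeat counts given in the paragraph above as minimum-repeat thresholds (10, 5, 4, 3, 3, and 3 for motif lengths 1 to 6), which is an interpretation. The function name and the test sequence are hypothetical.

```python
import re

# Minimum repeat counts per motif length (1-6 nt). These are an assumed
# reading of the thresholds described in the text, not confirmed settings.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) for perfect SSRs in `seq`.

    Start positions are 0-based. Motifs that are themselves repeats of a
    shorter unit (e.g. 'AA' or 'ATAT') are skipped, so a tract is reported
    under its shortest repeat unit only.
    """
    seq = seq.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        # A k-mer followed by at least (n_min - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.match(seq, pos)
            if m is None:
                pos += 1
                continue
            motif = m.group(1)
            reducible = any(
                motif == motif[:p] * (k // p)
                for p in range(1, k) if k % p == 0
            )
            if reducible:
                pos += 1
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
            pos = m.end()
    return sorted(hits)


# Synthetic sequence built for illustration; not from the P. angulata genome.
toy_seq = ("GC" + "A" * 12 + "CG" + "AT" * 6 + "GGA"
           + "TTC" * 4 + "G" + "TTTA" * 3 + "C")
ssrs = find_ssrs(toy_seq)
for start, motif, count in ssrs:
    print(start, motif, count)
```

On the synthetic sequence the scan reports one tract per motif class: (A)12, (AT)6, (TTC)4, and (TTTA)3. A real analysis would run the tool named in the text on the GenBank record instead.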

Data Analysis
To ensure the accuracy of the results, each pair of primers was used for PCR amplification and electrophoresis detection twice, and only the cpSSR fragments with high definition and good stability were scored. Genetic variation at each locus was characterized in terms of the number of alleles. The PIC was calculated using the following formula:
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} q_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 q_i^2 q_j^2$$
where n is the number of alleles, q_i is the frequency of the i-th allele, and q_j is the frequency of the j-th allele [52]. The percentage of polymorphic loci (PPL), the number of observed alleles (Na), the number of effective alleles (Ne), Shannon's information diversity index (I), Nei's genetic diversity index (h), and the total genetic variation of the population (Ht) were calculated using PopGene32 Version 1.32 (https://sites.ualberta.ca/~fyeh/popgene_download.html; accessed on 22 August 2022). The genetic variation within the population (Hs), population genetic differentiation coefficient (Gst), and gene flow (Nm) were also calculated using PopGene32 Version 1.32. On the basis of the genetic consistency of the populations, an unweighted pair group method with arithmetic mean (UPGMA) cluster diagram and a three-dimensional principal coordinate analysis (PCoA), which enable the visualization of genetic variation distribution across populations, were produced using NTSYS-PC 2.10e software [53]. On the basis of the genetic distance among the 203 individuals, a neighbor-joining (NJ) cluster diagram was constructed in MEGA X [54]. Analysis of molecular variance (AMOVA) and a two-dimensional principal coordinate analysis (PCoA) across individuals were computed using GenAlEx 6.5 software [55]. The population genetic structure was analyzed using the Bayesian clustering analysis method in STRUCTURE 2.3.4 software [56]. The estimated range of group K values was set to 2-15, and each K value was run 10 times. In addition, the burn-in length and the number of MCMC (Markov chain Monte Carlo) iterations were set to 10,000 and 50,000, respectively. Lastly, using Structure Harvester online software (http://taylor0.biology.ucla.edu/structureHarvester; accessed on 22 August 2022), the results of the STRUCTURE runs were analyzed to find the best group K value [57,58].
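As an illustration of the per-locus statistics named above, the sketch below computes PIC in its standard form (one minus the sum of squared allele frequencies, minus the sum over allele pairs of 2·q_i²·q_j²) along with textbook definitions of Ne, h, and I. The definitions Ne = 1/Σq², h = 1 − Σq², and I = −Σ q ln q are assumed here and are not quoted from the PopGene32 documentation. The allele frequencies are hypothetical.

```python
import math


def pic(freqs):
    """Polymorphism information content for one locus.

    PIC = 1 - sum(q_i^2) - sum over i<j of 2 * q_i^2 * q_j^2
    """
    n = len(freqs)
    homozygosity = sum(q * q for q in freqs)
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(n - 1) for j in range(i + 1, n))
    return 1.0 - homozygosity - cross


def nei_h(freqs):
    """Nei's gene diversity, assumed here as h = 1 - sum(q_i^2)."""
    return 1.0 - sum(q * q for q in freqs)


def effective_alleles(freqs):
    """Effective number of alleles, assumed here as Ne = 1 / sum(q_i^2)."""
    return 1.0 / sum(q * q for q in freqs)


def shannon_index(freqs):
    """Shannon information index, assumed here as I = -sum(q_i * ln q_i)."""
    return -sum(q * math.log(q) for q in freqs if q > 0)


# Hypothetical allele frequencies at a single locus (must sum to 1).
toy_freqs = [0.5, 0.3, 0.2]
print(round(pic(toy_freqs), 4),
      round(nei_h(toy_freqs), 4),
      round(effective_alleles(toy_freqs), 4),
      round(shannon_index(toy_freqs), 4))
```

For a locus with two equally frequent alleles these functions give PIC = 0.375, h = 0.5, and Ne = 2, which is a convenient sanity check before applying them to scored marker data.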

Conclusions
In conclusion, this is the first study to develop a novel set of cpSSR markers and apply them to the investigation of genetic diversity and genetic structure in P. angulata populations. Our study revealed that most of the newly developed cpSSR markers had a high level of stability and polymorphism. The cpSSR analysis confirmed that P. angulata populations contain considerable genetic diversity, and there are high levels of genetic differentiation among populations. The 16 populations of P. angulata were clustered into four groups with some significant geography-related population structure and extensive admixture. Our study demonstrated that cpSSR markers can be used as a powerful tool for evaluating genetic diversity and population structure in P. angulata.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/plants12091755/s1: Table S1. Information on 57 SSR motifs detected in the chloroplast genome of P. angulata; Table S2. Information on the 54 cpSSR primer pairs developed in this study.