Genetic Diversity and Population Structure of the Spring Orchid Cymbidium goeringii in Korean Distant Islands

Abstract: The spring orchid (Cymbidium goeringii), found in northeast Asia, is one of the most popular and horticulturally important species of the orchid family. This study analyzed the genetic diversity and population structure of spring orchid populations on small islands and in mainland South Korea using 11 microsatellite markers. The genetic diversity of the spring orchid populations on the distant islands (Heuksan Island and Ulleung Island) was slightly lower than that of the mainland population (Yeonggwang-gun). The mainland population was genetically separated from the island populations. The population of Ulleung Island, located to the east of the Korean peninsula, was genetically closer to the populations from China and Japan than to the populations from Yeonggwang-gun and Heuksan Island, which are geographically close to China. These results imply that the spring orchid populations in Yeonggwang-gun and on Heuksan Island appear not to be influenced by the yellow dust winds. As the first population genetic study of spring orchids distributed on small distant islands, our study will be useful for understanding the genetic diversity and population structure of isolated C. goeringii populations.


Introduction
The genus Cymbidium of the family Orchidaceae comprises about 70 species with high taxonomic and morphological diversity and is frequently found in many Asian countries and Australia [1][2][3]. Cymbidium species usually prefer slightly cool growing conditions, but they are also found in tropical and subtropical areas. Cymbidium goeringii Lindley, commonly known as the spring orchid, is widely distributed in East Asian countries such as China, Korea, and Japan. C. goeringii propagates both sexually and asexually. Its cultivars are mostly vegetatively propagated, whereas wild individuals are propagated by sexual reproduction, predominantly via self-pollination [4]. It is one of the most popular and horticulturally important species in the genus Cymbidium because of its many varieties and the characteristic colors and shapes of its leaves and flowers. Moreover, spring orchids with attractive phenotypes are collected from the field and traded commercially. Hence, it is necessary to develop a conservation plan to protect this species from over-collection.
Microsatellite markers are widely used in forensic biology and population genetics because the cost of development is low and the functional diversity of the population can be usefully analyzed through the representation of transcribed genes [5]. Among the available genetic markers, microsatellites, also called simple sequence repeats (SSRs), have been the most frequently applied in phylogenetic analysis and classification of Cymbidium species [6][7][8][9][10][11][12]. Microsatellites can provide information about the genetic diversity, inbreeding probability, genetic differentiation, and population structure of C. goeringii [8,11,12]. Hyun et al. (2012) developed 21 polymorphic microsatellites from SSR-enriched genomic libraries of C. goeringii and applied them in the analysis of four populations (East Korea, West Korea, China, and Japan) [8]. Lee et al. (2020) determined the sequences of 13 microsatellites for molecular authentication of ten commercially high-priced cultivars of Korean C. goeringii with characteristic flower phenotypes [12].
In spring, the yellow dust wind (also called the yellow sand wind), which is one of the main types of westerly current, blows strongly from China to Korea. It has been proposed that yellow dust carries heavy metal pollutants, viruses, bacteria, and fungi [13]. Yellow dust rises into the air from the deserts of China or Mongolia and is carried by the westerlies, thus influencing the west coastal area of Korea more than the east coastal area. When wild C. goeringii varieties with rare phenotypes are found in the west coastal area of Korea, orchid cultivators and collectors tend to believe that they originated from Chinese seeds carried by the yellow dust wind. The very small seeds of C. goeringii, about 1.2-1.8 mm long, could easily be blown from burst pods by the wind during spring. However, no study has yet provided evidence for the settlement of C. goeringii seeds blown from China.
Few studies have been performed on the population genetic diversity of C. goeringii on small islands far from the mainland. In the present study, we investigated the genetic diversity and population structure of C. goeringii populations collected from small islands far from the mainland via analyses of 11 microsatellite loci. In particular, to estimate the effect of yellow dust on the C. goeringii populations in Korea, Ulleung Island and Heuksan Island, located to the east and west of Korea, respectively, were selected as survey sites. We also compared the genetic structure and origin of C. goeringii distributed in Korea, China, and Japan.

Field Investigation
A total of 268 leaf and root samples were collected and analyzed to compare their genetic diversity and confirm their genetic structure. Of these 268 samples, 104 were collected from three sites in Yeonggwang-gun (YG) in mainland South Korea between March 2018 and July 2020, 68 were collected from three sites on Ulleung Island (UL) during April 2019, and 96 were obtained from two sites on Heuksan Island (HS) during February 2019 (Figure 1). HS in the West Sea and UL in the East Sea are about 100 and 130 km away from the mainland, respectively. The spring orchid samples were transferred to a laboratory and immediately stored at −30 °C.

DNA Purification and Microsatellite Genotyping
The collected leaf or root tissues were ground into a fine powder under liquid nitrogen using a mortar and pestle or disrupted using a TissueLyser II (Qiagen, Hilden, Germany). Genomic DNA was extracted from the leaf and root powder using a DNeasy plant DNA isolation kit, following the manufacturer's protocol (Qiagen). The concentration and quality of the obtained DNA were determined using a NanoDrop 2000 (Thermo Fisher Scientific, Wilmington, NC, USA). Eleven microsatellites (CG415, CG649, CG709, CG722, CG1023, CG1028, CG1085, CG1210, CG1281, CG1320, and CG1400) were coamplified using the primer mixtures reported by Hyun et al. (2012) and Lee et al. (2020) [8,12]. Multiplex PCRs were performed using the method described in Hyun et al. (2012) [8]. The reaction mixture for PCR had a total volume of 10 µL, including 30 ng of genomic DNA, 1 µL of primer mixture with variable concentrations of primer pairs, and 5 µL of 2X PCR Master solution of the MultiMAX PCR kit (Intron Bio, Sungnam, Korea). Amplification was performed using 35 cycles of 20 s at 95 °C, 1 min at 58 °C, and 1 min at 72 °C using the SimpliAmp Thermal Cycler (Applied Biosystems, Foster City, CA, USA). We visualized the amplicons using the Seq-Studio Genetic Analyzer (Thermo Fisher-Applied Biosystems, Foster City, CA, USA), and evaluated the dataset for genotype errors and the presence of null alleles using GeneMapper version 6.1 (Thermo Fisher-Applied Biosystems).

Determination of Genetic Diversity and Genetic Distance
The genetic diversity and genetic distance were determined from the genotype dataset. Deviations from Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) of the 11 microsatellite loci in the 268 spring orchid samples were assessed using GENEPOP version 4.7 [14]. Null alleles were not present in our dataset. Loci that deviated significantly from HWE in all three groups (YG, UL, and HS) and loci that showed evidence of linkage disequilibrium with all other loci were excluded from the analyses. We used Excel with the add-in GenAlEx version 6.5 [15] and Arlequin version 3.5 [16] to calculate the genetic diversity, diversity indices, and genetic distance of the samples. The following genetic diversity attributes and indices were calculated for all populations: mean number of alleles (NA), effective number of alleles (NE), observed (HO) and expected (HE) heterozygosities, Shannon's information index (I), molecular diversity (h), and the inbreeding coefficient relative to the subpopulation (FIS).
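The per-locus indices listed above can be illustrated with a minimal sketch using their textbook definitions. This is not the GenAlEx or Arlequin implementation; the function name `locus_diversity` and the toy genotypes (allele sizes in bp) are hypothetical, and molecular diversity (h) is not computed here.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Textbook diversity statistics for one locus.

    genotypes: list of diploid genotypes as (allele_a, allele_b) tuples.
    Assumes no missing data; names and inputs are illustrative only.
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    total_copies = 2 * n
    freqs = [c / total_copies for c in allele_counts.values()]

    homozygosity = sum(p * p for p in freqs)           # sum of squared allele frequencies
    n_a = len(freqs)                                   # number of alleles (NA)
    n_e = 1.0 / homozygosity                           # effective number of alleles (NE)
    h_o = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity (HO)
    h_e = 1.0 - homozygosity                           # expected heterozygosity (HE)
    shannon = -sum(p * math.log(p) for p in freqs)     # Shannon's information index (I)
    f_is = 1.0 - h_o / h_e if h_e > 0 else float("nan")  # inbreeding coefficient (FIS)
    return {"NA": n_a, "NE": n_e, "HO": h_o, "HE": h_e, "I": shannon, "FIS": f_is}


# Hypothetical toy data: four individuals typed at one locus.
toy = [(100, 100), (100, 104), (104, 104), (100, 104)]
stats = locus_diversity(toy)
```

In a multi-locus dataset, such values would typically be computed per locus and per population and then averaged, which is one reason averaged summary values need not satisfy the single-locus formulas exactly.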
Diversity 2020, 12, 486
Microsatellite genotypes of Henan samples from China (15 samples, CH) and Gunma samples from Japan (eight samples, JP) were obtained from the data reported in Hyun et al. (2012) [8] to compare the genetic distance and population structure among the five populations (YG, UL, HS, CH, and JP). The number of multilocus genotypes showed that the 11 loci are sufficient for distinguishing all 291 samples, including the Chinese and Japanese samples (Figure 2). A dendrogram was constructed with the unweighted pair group method with arithmetic mean (UPGMA) in the Past 3 software [17], using genetic distances obtained from GenAlEx version 6.5. The population differentiation (FST) and its p-value were calculated for all pairs of populations using Arlequin version 3.5 to compare the genetic distance among populations.
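The idea behind the multilocus-genotype check can be sketched as follows: for each number of loci, count how many distinct multilocus genotypes the best locus subset can resolve. This is only an illustration under assumptions. The paper does not state how its curve was generated (random resampling of loci is a common alternative), and the function names and toy data below are hypothetical.

```python
from itertools import combinations


def distinct_genotypes(samples, loci_idx):
    """Number of distinct multilocus genotypes when only loci_idx are used."""
    return len({tuple(sample[i] for i in loci_idx) for sample in samples})


def accumulation_curve(samples):
    """For k = 1..n_loci, the maximum number of distinct genotypes over all k-locus subsets."""
    # Sort each diploid genotype so that (100, 104) and (104, 100) are treated as equal.
    normalized = [tuple(tuple(sorted(g)) for g in sample) for sample in samples]
    n_loci = len(normalized[0])
    curve = []
    for k in range(1, n_loci + 1):
        best = max(distinct_genotypes(normalized, subset)
                   for subset in combinations(range(n_loci), k))
        curve.append(best)
    return curve


# Hypothetical toy data: four individuals typed at three loci.
toy_samples = [
    ((100, 100), (200, 204), (150, 150)),
    ((100, 100), (204, 200), (150, 154)),
    ((100, 104), (200, 204), (150, 150)),
    ((104, 100), (200, 200), (150, 150)),
]
curve = accumulation_curve(toy_samples)
# Smallest number of loci that separates every individual, or None if none does.
min_loci = next((k + 1 for k, v in enumerate(curve) if v == len(toy_samples)), None)
```

With 11 loci there are only 2,047 non-empty locus subsets, so the exhaustive search used here stays cheap for a dataset of this size.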

Determination of Population Structure
STRUCTURE version 2.3.4 [18] with an admixture model was used to perform Bayesian clustering. This analysis allowed us to infer the proportion of an individual's (i) genetic material that was inherited from ancestors in each of K populations. Each analysis consisted of 100,000 simulations after a burn-in of 100,000 simulations. We tested 1 to 11 possible clusters with three independent runs each, and the ΔK method [19] in STRUCTURE Harvester [20] was used to identify the most likely K value. Discriminant analysis of principal components (DAPC) [21], implemented in the "adegenet" package in R [22], was used to represent the identified population clusters while capturing the highest amount of variation. This analysis uses discriminant functions, each a linear combination of correlated alleles, obtained by linear discriminant analysis (LDA) of the principal components (PCs) generated after reducing the dimension of the genetic variation using principal component analysis [23]. We used this analysis to determine how well the spring orchid populations from the ten sampling sites could be distinguished from one another.
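The ΔK statistic mentioned above can be sketched as the absolute second-order change in the mean log-likelihood between successive K values, divided by the standard deviation of the log-likelihood across replicate runs at K. The sketch below computes it from run means, which is an assumption about the tabulation rather than a description of STRUCTURE Harvester's code. The function name `delta_k` and all log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Delta K for each interior K.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    Delta K is undefined for the smallest and largest K tested, and for any K
    whose replicate runs have zero standard deviation.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # needs both neighbours
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical log-likelihoods for K = 1..4 with three runs each.
toy_runs = {
    1: [-9000.0, -9010.0, -8990.0],
    2: [-8000.0, -8002.0, -7998.0],
    3: [-7900.0, -7950.0, -7850.0],
    4: [-7850.0, -7900.0, -7800.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
```

Because ΔK needs both neighbouring K values, a tested range of 1 to 11 yields ΔK only for K = 2 to 10, so the method cannot by itself support K = 1.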

Figure 2. Relationship between the number of multilocus genotypes and the number of loci in 291 individuals of the spring orchid. When the number of loci was seven or higher, the 291 individuals could be 100% separated.

Comparison of Genetic Diversity among Spring Orchids from Three Regions
NA across all 11 loci and all 268 samples was 6.318 (ranging from 3.636 to 9.818; UL1 to YG2), overall NE was 3.143 (ranging from 2.400 to 3.594; UL1 to YG2), overall HO was 0.537 (ranging from 0.389 to 0.641; UL2 to YG1), overall HE was 0.625 (ranging from 0.542 to 0.689; UL1 to YG2), overall I was 1.282 (ranging from 0.957 to 1.550; UL1 to YG2), overall FIS was 0.140 (ranging from −0.060 to 0.362; UL1 to UL2), and overall h was 0.639 (ranging from 0.560 to 0.711; UL3 to YG1).
To compare genetic diversity among groups, we compared the overall values of I, FIS, and h of each group with the corresponding overall values across all loci and samples. The overall I in YG (1.459) was higher than the overall I across all loci and samples, whereas the overall I in UL (1.163) and HS (1.195) was lower. The overall FIS in YG (0.086) was lower than the overall FIS across all loci and samples, whereas the overall FIS in UL (0.190) and HS (0.148) was higher. The overall h in YG (0.687) was higher than the overall h across all loci and samples, whereas the overall h in UL (0.619) and HS (0.596) was lower (Table 1).
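As a rough consistency check on the overall values reported above, the standard relationship between the inbreeding coefficient and the two heterozygosities can be applied directly to the overall HO and HE. This assumes the usual definition of FIS as the proportional deficit of observed heterozygotes; the reported value may instead be an average of per-locus or per-site estimates, so only approximate agreement is expected.

```latex
F_{IS} = 1 - \frac{H_O}{H_E} = 1 - \frac{0.537}{0.625} \approx 1 - 0.859 = 0.141
```

This is close to the reported overall FIS of 0.140.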

Pairwise Comparison of the Population Differentiation in Five Sites
The FST values of spring orchids among the sampling sites of the five groups ranged from 0.001 (YG2 to YG3) to 0.332 (UL1 to HS1). The FST values were not significant between YG1 and YG2 (p = 0.153) or between YG2 and YG3 (p = 0.540).
The FST between the YG and HS groups was lower than the FST between the UL and YG groups and between the UL and HS groups. The FST between the China and YG groups was lower than the FST between the China and UL groups and between the China and HS groups. The FST between the Japan and HS groups was higher than the FST between the Japan and YG groups and between the Japan and UL groups (Figure 3).
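To make the pairwise comparisons above concrete, the sketch below computes a simple Nei-style FST for one locus from the allele frequencies of two populations, as (HT − HS)/HT. This is a deliberately simpler estimator than the one used in the study, which relied on Arlequin; it gives equal weight to both populations and applies no sample-size correction. The function names and toy frequencies are hypothetical.

```python
from collections import Counter


def allele_freqs(genotypes):
    """Allele frequencies from a list of diploid (allele_a, allele_b) genotypes."""
    counts = Counter(a for g in genotypes for a in g)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}


def nei_fst(freqs_1, freqs_2):
    """Single-locus FST = (HT - HS) / HT for two equally weighted populations."""
    alleles = set(freqs_1) | set(freqs_2)
    h_1 = 1.0 - sum(p * p for p in freqs_1.values())   # expected heterozygosity, population 1
    h_2 = 1.0 - sum(p * p for p in freqs_2.values())   # expected heterozygosity, population 2
    h_s = (h_1 + h_2) / 2.0                             # mean within-population heterozygosity
    pooled = [(freqs_1.get(a, 0.0) + freqs_2.get(a, 0.0)) / 2.0 for a in alleles]
    h_t = 1.0 - sum(p * p for p in pooled)              # heterozygosity of the pooled population
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


# Hypothetical allele frequencies at one locus.
fst_diverged = nei_fst({"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8})
fst_identical = nei_fst({"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5})
fst_fixed = nei_fst(allele_freqs([(100, 100)]), allele_freqs([(104, 104)]))
```

Values near 0 indicate that two populations share similar allele frequencies, and values near 1 indicate fixation for different alleles. The significance tests reported in the text additionally require permutation of individuals between populations, which is not shown here.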

Analysis of Population Structure from Five Sites
In the STRUCTURE Harvester analysis, the best-supported K value among the samples from the five sites (YG, UL, HS, CH, and JP) was determined to be 2 (Figure 4). The UPGMA tree provided evidence that the spring orchids in each group of the five sites (YG, UL, HS, CH, and JP) were clearly separated (Figure 5a). In the STRUCTURE analysis, the spring orchids of the five sites were divided into two clusters. The YG and HS groups were inferred to derive from the same ancestral cluster, and UL was assigned to the same cluster as CH and JP (Figure 5b).
With DAPC, discriminant function 1 (DF1) explained 52.06% and DF2 explained 18.04% of the total genetic variation in spring orchids from the sampling sites. DF1 supported the idea that the spring orchids of the YG and HS groups were assigned to the same cluster and were separated from the spring orchids of the UL group. DF1 also indicated that the spring orchids of China and Japan were assigned to the same cluster and were separated from the three other groups (YG, UL, and HS). DF2 showed that the spring orchids of the YG, HS, and China groups were assigned to the same cluster and were separated from the spring orchids of the UL and Japan groups. DF2 also indicated that the spring orchids of UL3 were assigned to the same cluster as the spring orchids of Japan, whereas they were separated from the spring orchids of UL1 and UL2 (Figure 6a). The fine-scale structure detected through DAPC separated the spring orchids from the three sites of UL, whereas the samples of YG and HS were assigned to the same cluster. China and Japan were each identified as an independent cluster (Figure 6b).
Figure 6. (a) Discriminant function 1 (DF1) explained 52.06% and DF2 explained 18.04% of the genetic variation in spring orchids from ten sites. Each node represents the genotype of a spring orchid, connected to a centroid that was assigned based on the clustering of the DAPC scores. The dotted line represents the spanning tree among spring orchids from the ten sites. (b) Membership probability of DAPC determined that the sampled individuals were optimally clustered into seven groups, whereas the sampling sites of YG and HS were clustered into two groups.

Discussion
This study determined the genetic diversity, inbreeding coefficient, population structure, and clustering of three C. goeringii populations in South Korea through polymorphism analysis of 11 microsatellites. Spring orchids from YG had relatively high genetic diversity (I and h) and a low inbreeding coefficient (FIS). The genetic diversity of spring orchids from UL and HS was similar, whereas FIS was higher on UL than on HS. Bayesian (STRUCTURE) and multivariate (DAPC) clustering methods, in addition to genetic distances (pairwise FST), indicated that the genetic structures of the spring orchid populations from YG and HS were similar, whereas the populations from UL were independent.
In contrast to the high inbreeding levels of the small, isolated island populations with limited gene flow, relatively high genetic diversity and a low inbreeding coefficient were observed in the YG population on the mainland, in which only one locus deviated from HWE. In general, sample size can affect estimates such as HO, HE, and the number of alleles per locus [24], so the inbreeding coefficient and genetic diversity could also be affected, but this pattern was not seen in our results (Table 1). In our study, the sample sizes of most populations were at or above the level considered stable (20 or more) [24], rather than in the range generally described as small (5-10).
If spring orchid seeds from China were directly introduced into Korea by the yellow dust, they would occur at a much higher frequency on HS, which is closer to China, than on UL, which is far from China and is shielded by a physical barrier in the form of the Taebaek Mountains, the longest mountain range in Korea (Figure 1). However, when the mean population genetic data were compared between the populations from UL and HS, the genetic diversity of the spring orchids in these two populations was very similar. These results suggest that the inflow of spring orchids into HS, which is close to China, is very limited, and that its population remains closed, similar to that on UL. Moreover, the inbreeding coefficient of the population on UL was higher than that of the population on HS. Many studies have reported that the inbreeding rate increases with the level of habitat fragmentation [25][26][27][28]. Moreover, the over-collection of C. goeringii in the field is negatively correlated with genetic differentiation and kinship [29]. This species usually inhabits relatively low hills or mountains in East Asian countries [1,30,31]. Although the area of UL (72.9 km²) is larger than that of HS (20.0 km²), UL has a high-altitude forest in the middle of the island, which is not the case on HS (Figure 1). These geographic features seem to limit the living space of spring orchids on these islands, thereby increasing inbreeding.
The STRUCTURE analysis, DAPC, and pairwise FST showed that the population of UL has developed independently from the populations in YG and HS. The STRUCTURE analysis indicated that the population of UL was genetically separated from the populations of YG and HS, but was closely linked to the populations of China and Japan. DAPC and pairwise FST also showed that the population of UL was genetically closer to the populations of China and Japan than to the Korean populations in YG and HS. This result implies that the spring orchid populations inhabiting YG and HS may not have originated from China via yellow dust winds.
As the first population genetic study of spring orchids distributed on small distant islands, our study suggests that the genetic diversity of small isolated populations is slightly lower than that of the inland population and that these populations are genetically separated from the inland population. In addition, this study concluded that the contribution of yellow dust to the spring orchid distribution in the west of Korea was absent or very weak. This study will be useful for understanding the genetic diversity and population structure of isolated populations, and for the conservation of C. goeringii.