Assessment of Genetic Diversity of Tea Germplasm for Its Management and Sustainable Use in Korea Genebank

Tea (Camellia sinensis (L.) O. Kuntze) is cultivated in many developing Asian, African, and South American countries, and is the most widely consumed beverage in the world. Understanding the genetic diversity and population structure of tea germplasm is critical for its effective collection, conservation, and utilization. In this study, 410 tea accessions collected from South Korea were analyzed using 21 simple sequence repeat (SSR) markers. Of the 410 tea accessions, 85.4% (350 accessions) were collected from Jeollanam-do. A total of 286 alleles were observed, and genetic diversity and evenness averaged 0.79 and 0.61, respectively, across all tested samples. Discriminant analysis of principal components detected four clusters among the 410 tea accessions. Among them, cluster 1 showed a higher frequency of rare alleles (frequency less than 1%). Based on the index of association and the rbarD value, each cluster showed a clonal mode of reproduction. Analysis of molecular variance (AMOVA) showed that most of the observed variation was within populations (99%) rather than among populations (1%). The present study revealed lower diversity and a simpler population structure in Korean tea germplasm. Consequently, more attention should be paid to collecting and conserving new tea individuals to broaden the genetic variation available for new cultivars in future tea breeding.


Introduction
Tea (Camellia sinensis (L.) O. Kuntze, 2n = 2x = 30) is one of the most popular non-alcoholic beverages worldwide, and is consumed by approximately 70% of the world's population for its refreshing taste, attractive aroma, therapeutic uses, and mildly stimulating properties [1]. It is an economically important tree crop, grown in over 52 countries in Asia, Africa, and South America [2,3]. Tea is a woody evergreen perennial and is reported to be native to the Yunnan and Sichuan provinces of China and the northern part of Myanmar [4]. In Korea, although tea was introduced from China as early as the seventh century, the tea industry developed slowly and production remained small [5].
The importance of using genetic resources in breeding programs to enhance crop genetic potential has been well recognized [6]. Many germplasm appraisal methods, such as morphology, biochemistry, molecular markers, and sensory evaluation, have been used to evaluate tea germplasm resources [7–9]. Phenotype can serve as a useful standard for evaluating tea germplasm because it assesses genetic diversity directly from morphological traits [10]. Recently, molecular markers have proven to be one of the most effective tools for identifying different tea varieties [2,7,11–14].
Tea is an out-crossing species, and selected elite genotypes are propagated vegetatively and released as clonal varieties [13,15,16]. Clonal identification is traditionally based on morphological descriptors such as plant shape, stem width, leaf shape, young leaf type, and fruit shape [15,17]. However, as in many out-crossing crops, tea is highly heterozygous, with most of its morphological, physiological, and biochemical descriptors showing continuous variation and high plasticity [18,19]. Korir et al. reported that, despite their advantages, morphological traits have drawbacks such as environmental influences on trait expression, epistatic interactions, and pleiotropic effects, among others [17]. In contrast, molecular markers are widely used because they are least affected by environmental factors and are present in essentially unlimited numbers. In addition, they make it possible to observe the genome directly and thus eliminate the shortcomings inherent in phenotypic observation [15]. In previous studies, the genetic diversity, discrimination, and differentiation of tea germplasm have been assessed using different DNA markers such as restriction fragment length polymorphism (RFLP), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), inter-simple sequence repeat (ISSR), and simple sequence repeat (SSR) [3,5,7,11–15,20–22].
In Korea, the national research institutes collected and conserved excellent tea individuals and investigated their morphological characteristics [23]. Some studies have also analyzed the genetic diversity of Korean tea germplasm using RFLP and RAPD [5,14,16]. However, the analysis of genetic diversity in Korean tea germplasm has been insufficient because those studies included very small populations (approximately 20-50 individuals). In the present study, 21 SSR primer pairs were used to analyze 410 tea accessions from Korea, and the aims were: (1) to evaluate the genetic diversity and population structure of Korean tea accessions and (2) to estimate the genetic differentiation and sources of variation among inferred populations. The results are expected to provide a deeper understanding of the genetic diversity, population structure, and differentiation of tea germplasm, and to guide the effective collection, conservation, and application of tea genetic resources in Korea.

Plant Materials
A total of 410 tea accessions were obtained from the National Agrobiodiversity Center (NAC) at the Rural Development Administration in South Korea. Of these, 400 accessions were collected from 31 cities in three provinces of South Korea, while ten accessions lacked collection-site data (Table S1).

DNA Extraction
Genomic DNA was extracted from the tea leaves using a Qiagen DNA extraction kit (Qiagen, Hilden, Germany). DNA quality and quantity were assessed by 1% (w/v) agarose gel electrophoresis and spectrophotometry (Epoch, BioTek, Winooski, VT, USA). Extracted DNA was diluted to 30 ng/µL and stored at −20 °C until PCR amplification.

SSR Genotyping
For SSR analysis, a total of 21 SSRs were fluorescently labeled (6-FAM, HEX and NED) and used for the detection of amplification products (Table 1 and Table S2). PCR reactions were carried out in a 25 µL reaction mixture containing 30 ng template DNA, 1.5 mM MgCl2, 0.2 mM of each dNTP, 0.5 µM of each primer, and 1 U Taq polymerase (Inclone, Korea). Amplification was performed under the following cycling conditions: initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 55 °C for 30 s, and extension at 72 °C for 1 min, with a final extension step at 72 °C for 10 min. Each amplicon was resolved on an ABI Prism 3500 DNA sequencer (ABI3500, Thermo Fisher Scientific Inc., Wilmington, DE, USA) and scored using GeneMapper Software (Version 4.0, Thermo Fisher Scientific Inc.).

Population Structure and Genetic Diversity
The number of alleles (Na), Shannon index (I), Nei's unbiased gene diversity (GD), and evenness were calculated using the poppr package for R software [24]. The analysis of molecular variance (AMOVA) and the calculation of the coefficient of genetic differentiation among populations (PhiPT) were performed using GenAlEx software (version 6.5) with 999 permutations [25].
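As an illustration of the per-locus statistics named above, the following minimal Python sketch computes them from a flat list of allele calls at one locus. It is not the poppr implementation. The function name `locus_diversity` and the toy allele sizes are hypothetical, and the evenness shown is the E5 index, which is assumed here to be the evenness measure reported.

```python
import math
from collections import Counter


def locus_diversity(alleles):
    """Summarise one locus from a flat list of allele calls.

    `alleles` holds every sampled allele copy (two per diploid accession).
    Hypothetical helper for illustration only.
    """
    counts = Counter(alleles)
    n = sum(counts.values())                 # number of allele copies sampled
    freqs = [c / n for c in counts.values()]

    na = len(freqs)                                    # number of alleles (Na)
    shannon = -sum(p * math.log(p) for p in freqs)     # Shannon index (I)
    dominance = sum(p * p for p in freqs)              # sum of squared frequencies
    gene_div = (1.0 - dominance) * n / (n - 1)         # Nei's unbiased gene diversity

    # E5 evenness (assumed measure): ratio of Simpson-based to Shannon-based
    # effective allele numbers, each reduced by one. Undefined for one allele.
    if na > 1:
        evenness = (1.0 / dominance - 1.0) / (math.exp(shannon) - 1.0)
    else:
        evenness = float("nan")

    return {"Na": na, "I": shannon, "GD": gene_div, "Evenness": evenness}


# Toy locus: two alleles at equal frequency in four diploid accessions.
stats = locus_diversity([150] * 4 + [152] * 4)
print(stats)
```

With two equally frequent alleles, evenness equals 1 and the Shannon index equals ln 2, which gives a quick sanity check on the formulas.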
The population structure was analyzed by a discriminant analysis of principal components (DAPC) using the adegenet package for R software [26,27]. The find.clusters function was used to detect the number of clusters in the population. It uses K-means clustering, which decomposes the total variance of a variable into between-group and within-group components. The best number of subpopulations is the one with the lowest associated Bayesian Information Criterion (BIC). A cross-validation function (xvalDapc) was used to confirm the correct number of principal components (PCs) to be retained. In this analysis, the data are divided into two sets: a training set (90% of the data) and a validation set (10% of the data). The members of each set are selected by stratified random sampling, which ensures that at least one member of each group or population in the original data is represented in both the training and validation sets. DAPC is carried out on the training set while retaining varying numbers of PCs, and the accuracy with which the analysis predicts the group membership of excluded individuals (those in the validation set) is used to identify the optimal number of PCs to retain. At each level of PC retention, the sampling and DAPC procedures are repeated many times [28]. The best number of PCs to retain is the one associated with the lowest root mean square error. The resulting clusters were plotted in a scatter plot of the first and second linear discriminants of DAPC.
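The stratified 90%/10% sampling step of the cross-validation scheme described above can be sketched in a few lines of Python. This is a simplified illustration, not the adegenet code. The function name `stratified_split` and the cluster labels are hypothetical, and the group sizes are taken from the cluster sizes reported in the Results.

```python
import random
from collections import defaultdict


def stratified_split(groups, train_fraction=0.9, seed=0):
    """Split sample indices into training and validation sets per group.

    Every group contributes at least one sample to each set, mirroring the
    stratified sampling described in the text. Hypothetical helper.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    train, valid = [], []
    for label, members in by_group.items():
        if len(members) < 2:
            raise ValueError(f"group {label!r} needs at least two samples")
        rng.shuffle(members)
        n_valid = round(len(members) * (1.0 - train_fraction))
        # Clamp so neither set is empty for this group.
        n_valid = min(max(n_valid, 1), len(members) - 1)
        valid.extend(members[:n_valid])
        train.extend(members[n_valid:])
    return sorted(train), sorted(valid)


# Group labels built from the four reported cluster sizes.
labels = ["C1"] * 138 + ["C2"] * 75 + ["C3"] * 108 + ["C4"] * 88
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))
```

In a full cross-validation, this split would be repeated many times for each candidate number of retained PCs, and the prediction error on the validation set would be averaged.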

Estimation of Reproduction Mode among 410 Korean Tea Accessions
Linkage disequilibrium was calculated to test for evidence of sexual reproduction. Linkage among loci can be caused by clonal reproduction and selection events; as linkage increases, populations fall into linkage disequilibrium, whereas recombination from sexual reproduction breaks up linkage among loci and generates linkage equilibrium. To quantify linkage, the poppr package calculates the indices Ia (the index of association) and rbarD (the standardized index of association). High values of Ia, i.e., values that differ strongly from 0, can be interpreted as evidence of strong linkage and linkage disequilibrium [29]. The rbarD value has been shown to be a more reliable estimator of linkage than Ia since it is not influenced by sample size [30], but to be thorough both metrics were calculated. Significance was tested by creating a null distribution (999 random permutations); if the observed rbarD value lies outside the null distribution, the null hypothesis that no linkage exists is rejected [24,29].
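The logic of the two indices and of the permutation test can be illustrated with a minimal Python sketch. It uses the standard variance-based definitions (Ia = V_O/V_E − 1, and rbarD as the same excess variance scaled by the maximum possible covariance among loci). It deliberately simplifies the data model by treating each locus genotype as a single state with a 0/1 mismatch distance, which is an assumption and not how poppr scores diploid SSR data. All function names are hypothetical.

```python
import math
import random
from itertools import combinations


def _variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def association_indices(genotypes):
    """Return (Ia, rbarD) for samples given as lists of per-locus states.

    Assumes every locus is polymorphic (non-zero variance).
    """
    n_loci = len(genotypes[0])
    pairs = list(combinations(range(len(genotypes)), 2))
    # 0/1 mismatch distance for every pair of samples at every locus.
    per_locus = [
        [0.0 if genotypes[a][j] == genotypes[b][j] else 1.0 for a, b in pairs]
        for j in range(n_loci)
    ]
    total = [sum(d[k] for d in per_locus) for k in range(len(pairs))]

    v_obs = _variance(total)                       # observed variance (V_O)
    locus_vars = [_variance(d) for d in per_locus]
    v_exp = sum(locus_vars)                        # expected under no linkage (V_E)

    ia = v_obs / v_exp - 1.0
    denom = 2.0 * sum(math.sqrt(a * b) for a, b in combinations(locus_vars, 2))
    return ia, (v_obs - v_exp) / denom


def permutation_p_value(genotypes, n_perm=999, seed=0):
    """Shuffle states within each locus to build the null distribution of rbarD."""
    rng = random.Random(seed)
    observed = association_indices(genotypes)[1]
    hits = 0
    for _ in range(n_perm):
        columns = []
        for j in range(len(genotypes[0])):
            column = [g[j] for g in genotypes]
            rng.shuffle(column)
            columns.append(column)
        shuffled = [list(row) for row in zip(*columns)]
        if association_indices(shuffled)[1] >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: two clonal lineages, so all three loci are perfectly linked.
clonal = [["A", "A", "A"]] * 3 + [["B", "B", "B"]] * 3
print(association_indices(clonal), permutation_p_value(clonal, n_perm=199))
```

For perfectly linked loci, rbarD reaches its maximum of 1 regardless of the number of loci, whereas Ia grows with the number of loci, which is why rbarD is the easier value to compare across data sets.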

Population Structure and Mode of Reproduction of 410 Tea Accessions
To understand the genetic relationships among the 410 tea accessions, DAPC analysis was performed (Figure 1). Four clusters were detected, corresponding to the lowest BIC value, using the find.clusters function. DAPC analysis was then carried out using the detected number of clusters. The first 50 PCs of the PCA (68.4% of variance conserved) and three discriminant eigenvalues were retained. These values were confirmed by a cross-validation analysis. The number of accessions in each cluster was 138, 75, 108, and 88 for clusters 1 to 4, respectively (Table 2).
Seventy-six rare alleles, defined as alleles with a frequency of less than 1%, were observed across the 21 SSR markers (Table 4). All the rare alleles were observed in 119 accessions from the four clusters. Specifically, 53.9% of rare alleles were observed in 41 accessions from cluster 1, followed by 39.5% in 30 accessions from cluster 2. Unique private alleles were found in 35 accessions, and 40% of them belonged to cluster 1.
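The rare-allele criterion (frequency below 1%) can be illustrated with a short Python sketch for a single locus. The function name `rare_alleles`, the tuple-based genotype layout, and the toy allele sizes are hypothetical and are not taken from the study's data files.

```python
from collections import Counter


def rare_alleles(genotypes, threshold=0.01):
    """Return {allele: frequency} for alleles below `threshold` at one locus.

    `genotypes` is a list of (allele_a, allele_b) tuples, one per diploid
    accession; None marks a missing call. Hypothetical helper.
    """
    counts = Counter(a for g in genotypes for a in g if a is not None)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items() if c / total < threshold}


# Toy locus: 99 homozygotes plus one accession carrying a single copy of 164.
toy = [(150, 150)] * 99 + [(150, 164)]
print(rare_alleles(toy))
```

Here allele 164 occurs once among 200 allele copies (frequency 0.005) and is therefore flagged as rare.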

Gene Flow
The sources of genetic differentiation among the inferred clusters were examined by AMOVA. The results indicated that 1% of the variation could be attributed to differentiation among clusters and 99% to variation within the inferred clusters (Table 5). PhiPT and gene flow (Nm) for the 410 tea accessions were 0.014 (p < 0.001) and 36.156, respectively. Pairwise population PhiPT values for the four clusters ranged from 0.01 (C2-C4) to 0.021 (C2-C3) (Table 6). Pairwise population estimates of gene flow (Nm) for the four clusters ranged from 23.717 to 47.872 migrants per cluster.
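The relationship between PhiPT and Nm can be checked with simple arithmetic. The formula used to derive Nm is not stated in the text. The sketch below assumes Nm = (1/PhiPT − 1)/2, chosen only because the reported values are numerically consistent with it once the rounding of PhiPT to three decimals is taken into account. The function name is hypothetical.

```python
def nm_from_phipt(phipt):
    """Gene flow estimate under the ASSUMED relation Nm = (1/PhiPT - 1) / 2."""
    return (1.0 / phipt - 1.0) / 2.0


def nm_range(phipt_rounded, half_width=0.0005):
    """Range of Nm compatible with a PhiPT value rounded to three decimals."""
    low = nm_from_phipt(phipt_rounded + half_width)
    high = nm_from_phipt(phipt_rounded - half_width)
    return low, high


# Reported (PhiPT, Nm) pairs: overall, C2-C3, and C2-C4.
for phipt, nm in [(0.014, 36.156), (0.021, 23.717), (0.010, 47.872)]:
    low, high = nm_range(phipt)
    print(f"PhiPT={phipt:.3f}: Nm between {low:.2f} and {high:.2f}; reported {nm}")
```

Because Nm varies with the reciprocal of PhiPT, small differences in very low PhiPT values translate into large differences in Nm, so the estimates should be read as indicating high gene flow rather than precise migrant counts.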

Discussion
Erosion of plant genetic diversity is a very serious problem caused by modernization and the replacement of wild plants or landraces with a few elite varieties [31,32]. Therefore, the collection and preservation of plant genetic resources are of immense importance for crop breeding to support the demands of a growing human population. Effective management and utilization of plant genetic resources require information about the origin of strains, phenotypic traits, and genetic diversity (identified by molecular techniques) [33]. In this study, the genetic diversity of 410 tea accessions collected and conserved in the Korea genebank was analyzed. Genetic diversity provides an assurance of future genetic progress and insurance against unforeseen threats to agricultural production such as disease epidemics or climate change. Thus, the fate of genetic diversity in these gene pools is of utmost importance if plant breeding is to continue to address the pressing needs of society, such as increased yield, genetic resistance to diseases and pests, improved nutritional and processing quality of crop products, and reduced environmental effects [34].
In the present study, about 85% of tea accessions were collected from Boseong and Suncheon in Jeollanam-do (JN) (Table 2). According to Eom and Kim, tea seeds obtained from China were first cultivated on Mount Jiri in JN, and so the majority of tea plants are found in the Honam region (Jeollanam-do and Jeollabuk-do) [35]. In addition, the Boseong experiment station (BES) and the Mokpo experiment station (MES, in a city close to Suncheon) have collected tea accessions since the late 1990s [18]. The two experiment stations probably collected tea accessions from the areas surrounding each station, and thus the largest number of tea accessions was collected in this region. The tea accessions of the two research institutes have been managed as registered tea germplasm of the NAC and appear to have caused a regional imbalance in the collection of tea accessions in Korea.
In this study, the mean Nei's gene diversity (GD, 0.792) across 21 SSR markers was higher than in other studies: 0.652 in 280 tea accessions using 23 SSR markers [7], 0.640 in 450 tea accessions using 96 EST-SSR markers [22], 0.543 in 185 Chinese tea cultivars using 48 SSR markers [13], and 0.680 in 64 Sri Lankan tea cultivars using 33 EST- or genomic-SSR markers [3]. The gene diversity of a locus, also known as expected heterozygosity, is a fundamental measure of genetic variation in a population, and describes the proportion of heterozygous genotypes expected under Hardy-Weinberg equilibrium [36].
As tea is an open-pollinated plant, it is highly heterogeneous and consequently shows broad genetic variation [13]. The results obtained here also showed high gene diversity, in a manner similar to previously reported data. However, Yao et al. noted that comparing the degree of genetic diversity between different studies is difficult, as the analysis may be affected by various factors such as sampling schemes, the number of SSR markers, the sizes of SSR repeats, and the location of SSRs in the genome [22].
Contrary to the result of higher gene diversity, the 410 tea accessions in this study were characterized by an extreme dearth of genetic diversity, as revealed by an overall Simpson's Dominance (λ) of 0.998. Furthermore, the AMOVA revealed no significant difference among populations, suggesting low genetic diversity across the entire collection region. In addition, the standardized index of association (rbarD = 0.0583, p < 0.001) supported the hypothesis of a clonal population structure based on the linkage disequilibrium tests, in which the null hypothesis of random mating was rejected for all populations. Under clonal propagation, heterozygosity and allelic diversity at each locus are expected to increase [37,38]. While high levels of clonality tend to increase genetic variation within a population, the opposite effect is expected on genetic differentiation among populations and on genotypic diversity, both of which decrease with the rate of clonal reproduction [37,39]. Indeed, the 410 tea accessions in this study were landraces and are likely to have been collected from private farms. As breeding a reliable cultivar on a private farm is nearly impossible, almost all tea gardens consist of seedling tea plants of local and wild origin with great morphological variation [23]. BES and MES also collected tea germplasm and investigated its morphological characteristics, observing considerable variation in the number of stems, stem length, leaf area, and leaf color within the scope of the traits investigated [40]. Because sufficient studies on the genetic diversity of Korean tea germplasm are lacking, only a few researchers have argued for the importance of the genetic collection and preservation of tea accessions [14,21].
Previous studies analyzed the genetic diversity of different tea accessions using molecular markers such as RFLP, RAPD, and SSR [3,5,21,22]. In addition, the STRUCTURE software has been used to analyze the population structure of tea germplasm [2,3,13,22]. To analyze the genetic diversity and population structure of Korean tea accessions, 21 SSR markers and DAPC analysis were used in this study. The DAPC method provides an interesting alternative to the STRUCTURE software, as it does not require populations to be in Hardy-Weinberg equilibrium and can handle large data sets without parallel processing software [41]. DAPC has been shown to divide populations into well-defined clusters associated with the provenance, ploidy, taxonomy, and breeding program of the genotypes and related to their genetic structure [42]. According to Rosyara et al., STRUCTURE, EIGENSTRAT, and DAPC are all able to control for population structure in association mapping studies [43]. EIGENSTRAT and DAPC performed slightly better than STRUCTURE, and DAPC led to a better separation among populations. In this study, DAPC analysis (four clusters) provided a more detailed clustering of the tea accessions than STRUCTURE (two populations) (Figure S1). Campoy et al. reported that their analyses of population structure in sweet cherry using STRUCTURE and DAPC showed good consistency between the two methods, and that DAPC provided a more detailed clustering among the populations than STRUCTURE [41].
According to the DAPC results, the 410 tea accessions were divided into four clusters (Figure 1). Among them, clusters 1 and 2 showed a higher frequency of rare alleles and higher genetic diversity, and there was high gene flow (Nm = 45.120) between the two clusters. Yao et al. reported that a majority of rare SSR alleles and higher diversity were observed in the tea accessions from Yunnan and its neighboring provinces, considered the center of origin of the tea plant in China [22]. They also reported that the allele number, genetic diversity, and PIC value of tea germplasm decreased significantly with distance from the center of origin of the tea plant. Although a particular collection area cannot be designated as an origin, the tea accessions in clusters 1 and 2 are thought to represent the origin of the Korean tea germplasm because of their proportion of rare alleles and higher genetic diversity.
Kaundun et al. reported that tea accessions collected from Korea showed higher genetic diversity than those from Taiwan and Japan [21]. On the other hand, Jeong and Park mentioned that the genetic variation in the Korean tea population is smaller than that in Chinese or Japanese wild tea populations [18]. The results of the present study confirmed that the 410 tea accessions collected and conserved in the Korea genebank exhibit narrow genetic variation. Park et al. suggested that the low genetic diversity of Korean tea resulted from its establishment from a limited gene stock from China [14]. The short history and relatively homogeneous environment of the southwestern part of the country, where tea was introduced, did not favor population differentiation. In addition, the loss of diversity was exacerbated by the mass destruction of tea plantations in the fourteenth century for political and religious reasons [44]. Consequently, as tea is a highly outcrossing species, variability is mostly expected within rather than between populations, as predicted by Hamrick [45].

Conclusions
In this study, the genetic diversity and population structure of 410 Korean tea accessions collected and conserved in the Korea genebank were analyzed using 21 SSR markers. The results provided molecular evidence for the narrow genetic base of the Korean tea accessions. According to the NAC database, 4223 tea accessions were initially collected and conserved (http://genebank.rda.go.kr). However, only 510 tea accessions remain conserved in the NAC, as many were destroyed by extreme cold during the winter season. Among them, 427 tea accessions were collected from Korea, 56 from China, 22 from Japan, and five from Indonesia. In conclusion, there is an urgent need to broaden the genetic base of the tea accessions in the Korea genebank, and this need can be met not only by collecting tea plants in Korea but also by introducing tea germplasm from other countries. Additionally, it is necessary to analyze the biochemical components of tea accessions in order to understand their effects on the quality characteristics of tea varieties and to promote the utilization of tea germplasm in tea breeding.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4907/10/9/780/s1, Table S1: List of 410 tea accessions used in this study; Table S2: Repeat motif and product size of 21 SSR primers used in this study; Table S3: Genetic diversity of 21 SSR markers in each cluster; Figure S1: (A) Relationship between delta K and K as revealed by STRUCTURE harvester. (B) Population structure analysis of 410 tea accessions inferred using STRUCTURE software based on 21 SSR markers for delta K = 2.