Genetic Diversity and Association Analysis among Germplasms of Diospyros kaki in Zhejiang Province Based on SSR Markers

In subtropical to temperate regions, persimmon (Diospyros kaki Thunb.) is an economically important fruit crop cultivated for its edible fruits. Persimmons are distributed abundantly and widely in Zhejiang Province, representing a valuable resource for the breeding of new cultivars and studying the origin and evolution of persimmon. In this study, we elucidated the genetic structures and diversity patterns of 179 persimmon germplasms from 16 different ecologic populations in Zhejiang Province based on the analysis of 17 SSR markers. The results show that there was a medium degree of genetic diversity for persimmon found in Zhejiang Province. With the exception of the Tiantai Mountain and Xin’an River populations, we found extensive gene exchange had occurred among the other populations. The 179 D. kaki germplasms from the 16 populations could be separated into three distinct clusters (I, II, and III) with a higher mean pairwise genetic differentiation index (FST) (0.2714). Nearly all samples of Cluster-I were distributed inland. Cluster-II and Cluster-III contained samples that were widely distributed throughout Zhejiang Province including all samples from the coastal populations and the Northeast Plain populations. In addition, we performed association mapping with nine traits (fruit crude fiber content, fruit calcium content, fruit water content, fruit longitudinal diameter, fruit aspect ratio, seed width, seed length, leaf aspect ratio, and number of lateral veins) using these markers. This led to the identification of 13 significant marker–trait associations (MTAs; p < 0.00044, 0.1/228) using a general linear model, of which, six MTAs with a correlation coefficient (R2) >10% were consistently represented in the general linear model with p < 0.00044 in the two models. The genetic structures and diversity patterns of the persimmon germplasms revealed in this study will provide a reference for the efficient conservation and further utilization of persimmon germplasms. The MTAs identified in this study will be useful for future marker-assisted breeding of persimmon.


Introduction
Persimmon (Diospyros kaki Thunb.) is a widespread temperate and subtropical species of the genus Diospyros belonging to the Ebenaceae family [1]. Persimmon was first cultivated in eastern China between the first to second centuries, corresponding to about 2000 years of planting history. So far, persimmon has been one of the most important fruit crops in China, Japan, and South Korea due to its deciduous fruit [2,3]. Approximately 1000 varieties of D. kaki have been found in China. The abundant diversity of the persimmon germplasm represents a valuable resource for the breeding of new cultivars and studying the origin and evolution of persimmon [4,5].
Zhejiang Province is located in southeast China with a long history of persimmon cultivation. The previous evaluation of phenotypic traits demonstrated that Diospyros germplasm resources are abundant and widely distributed in Zhejiang Province, and there were rich variations of agronomic traits [13][14][15], which may contain an abundance of desired genes/alleles such as for stress resistance and association with excellent flavor as a result of unique climate and diverse ecology. So far, little research has focused on evaluating and mining the genetic structures, diversity patterns, and desired genes/alleles of these sources, particularly through the use of codominant molecular markers.
The identification of loci influencing agronomic traits is the bases of marker-assisted breeding, which can effectively improve breeding efficiency [16]. Association mapping (AM) is a faster and more efficient approach for the high-resolution evaluation of complex traits and appears as a promising tool to circumvent the limitations of linkage mapping [16][17][18]. AM can effectively be used to associate genotypes with phenotypes in natural populations and identify marker-trait associations for several agronomic traits in various plant species [16][17][18][19][20][21][22][23][24]. However, due to the lack of persimmon genomic information, only a few molecular markers have been identified as linked to natural astringency loss [25][26][27] or sexual traits [28,29], which implies perhaps quality inheritance. Association studies of other quantitative traits have not been reported.
Of the many available DNA markers, microsatellites, also known as simple sequence repeat (SSR) markers, are an appropriate choice in association analysis compared to other types of molecular markers as they possess more characteristics that are advantageous for this type of application [18][19][20][21][22] such as random and wide distribution in the genome, which is genetically codominant, highly reproducible, multi-allelic, and perfectly suitable for high-throughput genotyping [18][19][20][21][22][30][31][32]. Microsatellites have been successfully used for plant crop and tree characterization and studies of genetic diversity and gene association analysis [18][19][20][21][22].
In the present study, we elucidated the genetic structures and diversity patterns of 179 persimmon sources in Zhejiang using SSR markers. We performed association mapping and identified markers associated with eleven traits of agronomic importance to determine marker-trait associations (MTAs). We explored the genetic diversity and marker-trait associations of the Zhejiang persimmon sources, which will be useful for future evolution studies and marker-assisted breeding of persimmon.

Plant Materials
We collected 179 semi-wild Diospyros kaki germplasms from 16 different eco-regions in Zhejiang Province from 2014 to 2015. These samples originated from the local sampling site. Each plant was collected at least 30 m away from its nearest neighbor. The 16 eco-regions is according to the long-term habitual classification of terrain in Zhejiang Province based on geographical and environmental parameters ( Table 1). The climatic conditions for each collection site were estimated by the Zhejiang Climate Data 2019 [33]. We calculated the following environmental factors; detailed information of the 179 sampled individuals are listed in Table S1.
All 179 D. kaki germplasms were grafted and kept during the spring of 2015 in the Plant Nursery of Lanxi City in Zhejiang Province (29 • 25 N, 119 • 51 E). All stocks were "YLNO.6", a superior line selecting from Diospyros kaki var. Silvestris. There were less differences both in the terrain and environment of the Plant Nursery. Each germplasm contained six plants at a spacing of 3 m × 4 m. The trial layout was a randomized block design with six replicates of single-tree plots. Persimmon tree management followed the standard of LY/T 1887-2010 [34]. All germplasms were in a stable fruiting period during the autumn of 2019. Six to eight normal leaves and fruits of every plant were collected, in total about 40 leaves and fruits of every germplasm, to measure six agronomic traits (fruit longitudinal diameter (FLD), fruit aspect ratio (FAR), seed width (SW), seed length (SL), leaf aspect ratio (LAR), and number of lateral veins (NLV)) of persimmon. The 6-8 normal fruits of every plant were pooled as one repetition to measure three agronomic traits (fruit crude fiber content (FCF), fruit calcium content (FCC), and fruit water content (FWC)). The correlation between traits was assessed by the Spearman correlation test using R software [35]. The mean values (Table S2) of these agronomic traits were used for the association study.

Simple Sequence Repeat (SSR) Genotyping
Genomic DNA was extracted from selected tender leaf tissue using an improved Cetyltrimethylammonium Bromide (CTAB) method [36] and the quality assessed using 1% agarose gel electrophoresis. The concentration was then determined using ultraviolet spectrophotometry.
The 17 microsatellite markers were selected from various sources [37][38][39] (Table S3). The amplification of the SSR reaction was carried out in an ABI Applied Biosystems Polymerase Chain Reaction (PCR) machine. The 10 µL reaction volume contained 50-100 ng total genomic DNA, 50 pmol primers, 0.25 U Taq DNA polymerase (Promega, Sunnyvale, CA, USA), 10× buffer, 200 µmolL −1 dNTP, and 2 mmolL −1 Mg 2+ . The PCR conditions were as follows: predenaturation at 94 • C for 3 min followed by 30 cycles of PCR amplification. Each cycle consisted of denaturation at 94 • C for 30 s, annealing for 30 s using the designated temperature depending on the primer pair, and extension at 72 • C for 1 min. The PCR products were finally extended at 72 • C for 5 min and then stored at 4 • C. The PCR products were resolved on 8% denaturing polyacrylamide gels, which were silver stained to detect SSR bands [35]. A 100 bp DNA ladder was included for the estimation of allele sizes. When estimating allele doses in polyploids, the situation is more complicated because there may be genotypic ambiguity even for codominant markers [40]. In addition we still do not know if the persimmon is allohexaploid or autohexaploid, or whether  [25]. Therefore, all alleles of the SSR markers were regarded as the dominant markers according to previous research of polyploids such as potato (Solanum tuberosum L.) (tetraploid) [41], cotton (Gossypium hirsutum L.) (allotetraploid) [42], wheat (Triticum aestivum L.) (hexaploid) [43], persimmon (hexaploid) [12], and sugarcane (Saccharum officinarum L.) (octoploid) [44,45]. The result obtained at the DKMP8 marker is displayed in Figure 1. SSR data were scored visually for allele size and presence or absence of each primer. SSR bands were scored as a qualitative character (e.g., present (1) or absent (0)) to create a binary matrix. All genotypic data can be found in Table S4. Next, the phenotypes of the 0/1 matrix were used for genetic diversity assessments and association analyses following the analytical methods of dominant markers.
Each cycle consisted of denaturation at 94 °C for 30 s, annealing for 30 s using the designated temperature depending on the primer pair, and extension at 72 °C for 1 min. The PCR products were finally extended at 72 °C for 5 min and then stored at 4 °C. The PCR products were resolved on 8% denaturing polyacrylamide gels, which were silver stained to detect SSR bands [35]. A 100 bp DNA ladder was included for the estimation of allele sizes. When estimating allele doses in polyploids, the situation is more complicated because there may be genotypic ambiguity even for codominant markers [40]. In addition we still do not know if the persimmon is allohexaploid or autohexaploid, or whether different germplasms are mostly like different types of hexaploidy [25]. Therefore, all alleles of the SSR markers were regarded as the dominant markers according to previous research of polyploids such as potato (Solanum tuberosum L.) (tetraploid) [41], cotton (Gossypium hirsutum L.) (allotetraploid) [42], wheat (Triticum aestivum L.) (hexaploid) [43], persimmon (hexaploid) [12], and sugarcane (Saccharum officinarum L.) (octoploid) [44,45]. The result obtained at the DKMP8 marker is displayed in Figure 1. SSR data were scored visually for allele size and presence or absence of each primer. SSR bands were scored as a qualitative character (e.g., present (1) or absent (0)) to create a binary matrix. All genotypic data can be found in Table S4. Next, the phenotypes of the 0/1 matrix were used for genetic diversity assessments and association analyses following the analytical methods of dominant markers.

Data Analysis
POPGENE 1.31 (University of Alberta, Edmonton, Canada) [46] was used to evaluate the following genetic diversity parameters: the allele frequency (MAF), number of alleles (Na) and effective number of alleles (Ne) per locus, heterozygosity observed (Ho), expected heterozygosity (He), Nei's gene diversity index (Nei), and Shannon's diversity index (I). The polymorphic information content (PIC) was calculated for each locus using the online program PICcalc [47]. The formula of main genetic variation parameters is listed as follows. The information of allele frequencies is shown in Table S5.
where Common_allele is the number of highest genotypes detected at a site, and n is the total number of genotypes detected at a site.

Data Analysis
POPGENE 1.31 (University of Alberta, Edmonton, Canada) [46] was used to evaluate the following genetic diversity parameters: the allele frequency (MAF), number of alleles (Na) and effective number of alleles (Ne) per locus, heterozygosity observed (Ho), expected heterozygosity (He), Nei's gene diversity index (Nei), and Shannon's diversity index (I). The polymorphic information content (PIC) was calculated for each locus using the online program PICcalc [47]. The formula of main genetic variation parameters is listed as follows. The information of allele frequencies is shown in Table S5.
where Common_allele is the number of highest genotypes detected at a site, and n is the total number of genotypes detected at a site.
where i is the number of distinct alleles at a locus and P i (i = 1,2, 3, . . . , I) is the frequency of allelei in the population. where N is the number of samples.
where P i is the frequency of the ith allele.
GenAlEx v.6.5 was used to determine the average pairwise genetic differentiation index (FST) between populations [48]. The analysis of molecular variance (AMOVA) was used to partition the genetic variance and was carried out using Arlequin 3.11 [49]. Genetic relationships between the sampled germplasms were elucidated through the construction of an unrooted neighbor joining (NJ) dendrogram based on simple matching coefficients using MEGA5 [50]. A bootstrap value of 1000 replicates was used to test the reliability of the NJ dendrogram. Principal component analysis (PCA) was performed using EIGENSOFT [51]. To summarize the patterns of variation in the multi locus data set, principal coordinate analysis (PCoA) was performed by GenAlEx version 6.5 based on the pairwise FST matrix. The isolation-by-distance pattern (IBD) was detected by Mantel tests [52] with 1000 permutations based on matrices of pairwise genetic distance (FST/1-FST) and geographic distance among populations performed in GenAlEx 6.5.
The population genetic structure was studied using the Bayesian clustering method implemented in STRUCTURE version 2.3.4 [53]. To determine the optimum number of subgroups (K), three independent runs were performed to estimate each K value from 1 to 10. For each run, a burn-in length of 5000 followed by 50,000 Markov chain Monte Carlo (MCMC) iterations were performed assuming an admixture model with default settings and allele frequencies. Structure Harvester [54] was used to visualize the best K value based on delta K (∆K) and maximum log likelihood L (K). Repeated sampling analysis and genetic structural plots were analyzed by CLUMPP [55]. POPHelper [56] was used to render the histogram base on the K value matrix after merging.
The associations between SSR markers and traits were tested using TASSEL 3.0 software [57]. Association mapping analysis was conducted using TASSEL through the general linear model (GLM) with the Q matrix as well as the mixed linear model (MLM) with the Q matrix and kinship (K matrix) procedures. The significant threshold for the association was set at different levels p < 0.00043 (0.01/number of loci). The phenotypic variation explained by each marker-trait association was studied using the correlation coefficient (R 2 ).

Characterization of Morphological Traits Persimmon Germplasms
The range and mean values of the nine morphological traits of the persimmon germplasms are provided in Table 2. All traits showed broad ranges across the germplasm accessions. Eight morphological traits had about two to six times difference (maximum/minimum). The highest FCR was approximately sixteen times the lowest. There were many different responses of correlations among the morphological traits and environmental factors (Table 3). There were no significant relationships between SumPR, SumMT, WinMT, altitude, and morphological traits. SprPR, WinPR, SumSR, AutSR, and SprMT showed negative correlation to more than one morphological trait. AutPR, longitude, WinSR, and AutMT showed a positive correlation to the morphological traits. Latitude showed a negative correlation to FCC and FCR, and positive correlation to LAR. Perhaps because of the influences of environmental factors on phenotypes, the morphological traits of the accessions were significantly different by geographical area (Table 2). FCC and FWC were both significantly high in germplasms from the Eastern Coastal Plain (e.g., Pop 15 and Pop 3). The lowest FCR, FCC, and SW were in both germplasms from Pop 8. These fruit and seed shape indices (e.g., FLD, FAR, and SL) had similar geographical distribution patterns that were higher in eastern coastal areas (e.g., Pop 3 and Pop 13), and lower in Pop 10 and the JingQu Basin overall (e.g., Pop 12).
The correlation among the nine agronomic traits was also analyzed. The pairwise correlation coefficients among fruit aspect ratio and fruit longitudinal diameter, fruit longitudinal diameter, fruit aspect ratio, and seed length were greater than 0.5, indicating strong associations (correlations) between each pair of these traits (Table S6). Note: Different lowercase letters for each parameter indicate significant differences among populations according to two-way analysis of variance (ANOVA) followed by Duncan multiple-range test (α = 0.05); * represents the minimum value of morphological traits in all 179 persimmon germplasms; # represents the maximum value of morphological traits in all 179 persimmon germplasms. Note: * represents significant difference at p < 0.05; ** represents significant difference at p < 0.01.

Characterization of Microsatellite Marker Polymorphism
A total of 228 polymorphic alleles with high polymorphic rates were generated for 179 D. kaki using 17 SSR markers (Table S7). The number of alleles per locus ranged from seven (MDP21) to 22 (DKMP13), with an average of 13.41 alleles per locus. Due to the hexaploid nature of the D. kaki genome, other parameters (e.g., the expected heterozygosity) were not calculated. The high polymorphic levels of the examined SSR markers are reflected by the PIC values for each marker, which ranged from 0.8566 to 0.9437; all were above 0.5 with an average of 0.8965 alleles per marker. We found 25 unique alleles for five markers at eight loci. Two unique alleles (Mdp16-150 and Mdp16-158) were only found in Pop 2 (Hanning4). One unique allele (Mdp8-284) were only found in Pop 7 (ChunAn8). As hexaploid species, high Na and Ho were observed within each accession, which ranged from 67 to 132 and 0.6471 to 1, respectively (Table S8). In addition, we detected significant relationships between Na and environmental factors (such as SprPR, AutPR, WinPR, SumSR, AutSR, WinSR, SprMT, longitude, and altitude). However, Ho was not related (Table S9). We observed that the Na of Mdp21 showed significant relationships only with altitude; another fifteen of these markers showed significant relationships with more than two factors. Therefore, these SSR markers were suitable for studying the genetic diversity of persimmon germplasms. We further analyzed the correlation between pairwise FST and geographical distance ( Figure 2) using the Mantel test. The results showed that there was a significant correlation between genetic distance and geographic distance (r = 0.117, p = 0.010), indicating a clear geographic origin-based structuring or predominate isolation by distance among the investigated germplasms.

Establishment of Genetic Information in Germplasms Populations
The genetic diversity of 179 genotypes of D. kaki sampled from 16 areas was analyzed and the results are shown in Table 4

Establishment of Genetic Information in Germplasms Populations
The genetic diversity of 179 genotypes of D. kaki sampled from 16 areas was analyzed and the results are shown in Table 4

Establishment of Genetic Information in Clusters
Using the SSR genotypic data, an unrooted neighbor joining (NJ) dendrogram (Figure 4)   The differentiation among the 16 populations was measured by pairwise FST (Table 5).
High FST values (>0.25), which are representative of a very high level of genetic differentiation as suggested by Wright [58], were mainly found between Pop 6 and other populations, and between Pop 8 and other populations. The highest pairwise FST value of 0.497 was found between Pop 6 and Pop 8 and indicates strong genetic differentiation. Other pairwise FST values between populations were nearly all lower than 0.25, indicating that the weak genetic differentiation among these populations may be due to existing extensive gene exchange.

Establishment of Genetic Information in Clusters
Using the SSR genotypic data, an unrooted neighbor joining (NJ) dendrogram ( Principal component analysis (PCA) and STRUCTURE analysis were also used to evaluate the relationships among germplasms. In the PCA ( Figure 5), all germplasms were more clearly divided into three groups than the NJ classification. STRUCTURE analysis also indicated the optimal cluster number as K = 3 based on delta K (Figures 6 and 7). However, the membership of the STRUCTURE and PCA clusters corresponded closely with those delineated by the NJ dendrogram. The slight difference in group 1 and group 2 by the NJ dendrogram corresponded to one cluster of STRUCTURE and PCA. Therefore, these germplasms were grouped into three clusters based on the comprehensive results of the NJ dendrogram, PCA, and STRUCTURE analyses. All persimmon germplasms, labeled in blue, red, and black, according to the three gene pools determined by STRUCTURE analyses, were then positioned on a map based on their origins. As Figure 8 shows, germplasms of the same group were mostly admixtures of different clusters (e.g., the samples of Pop 6 were divided into Cluster-I and Cluster-II, while Pop 5 consisted of samples of Cluster-I and Cluster-II). Notably, the classification pattern of the clusters was consistent with the geographical distribution. Nearly all samples of Cluster-I were those distributed inland while samples of Cluster-II and Cluster-III were mainly from coastal areas.  Principal component analysis (PCA) and STRUCTURE analysis were also used to evaluate the relationships among germplasms. In the PCA ( Figure 5), all germplasms were more clearly divided into three groups than the NJ classification. STRUCTURE analysis also indicated the optimal cluster number as K = 3 based on delta K (Figures 6 and 7). However, the membership of the STRUCTURE and PCA clusters corresponded closely with those delineated by the NJ dendrogram. The slight difference in group 1 and group 2 by the NJ dendrogram corresponded to one cluster of STRUCTURE and PCA. Therefore, these germplasms were grouped into three clusters based on the comprehensive results of the NJ dendrogram, PCA, and STRUCTURE analyses. All persimmon germplasms, labeled in blue, red, and black, according to the three gene pools determined by STRUC-TURE analyses, were then positioned on a map based on their origins. As Figure 8 shows, germplasms of the same group were mostly admixtures of different clusters (e.g., the samples of Pop 6 were divided into Cluster-I and Cluster-II, while Pop 5 consisted of samples of Cluster-I and Cluster-II). Notably, the classification pattern of the clusters was consistent with the geographical distribution. Nearly all samples of Cluster-I were those distributed inland while samples of Cluster-II and Cluster-III were mainly from coastal areas.   (Table 6), which demonstrated considerable diversity among the populations. The high pairwise FST value (Table S10) between clusters indicated that each cluster was clearly differentiated from the others. The percentage of variation also increased from 9.87% among populations (Table S11) to 26.23% among clusters (Table S12) (Table 6), which demonstrated considerable diversity among the populations. The high pairwise FST value (Table S10) between clusters indicated that each cluster was clearly differentiated from the others. The percentage of variation also increased from 9.87% among populations (Table S11) to 26.23% among clusters (Table S12). All clusters contained private alleles, with four (DK26-152, DK26-126, Mdp8-200 and Mdp8-224) found in Cluster-I, four (Mdp16-158, Mdp16-150, DKMP13-212, and DKMP13-230) in Cluster-II, two (Mdp8-284 and DKMP15-500) in Cluster-III. While, four, five, twelve alleles were not found in Cluster-I, Cluster-II, and Cluster-III, respectively (Figure 9).

Association Mapping
We identified 18 significant marker-trait associations (p < 0.00044, 0.1/228) for the nine traits using GLM and MLM models in association mapping. A greater number of significant marker-trait associations were observed using GLM (13) compared to MLM (5).
Six of the detected MTAs exhibited R 2 ≥ 10% in the general linear model or p < 0.00044 in the two models and were retained for further analysis ( Table 7). As an exception, we included the significant MTAs for loci DKMP17-282 associated with FCR, DK11-297 and DK11-271 associated with NLV both in GLM and MLM analysis with R 2 values ranging from 9% to 9.2%, 7.63% to 7.73%, 6.87% to 8.48%, 9.48% to 9.51%, and 10.18% to 10.19%, respectively. Notably, the SSR loci MDP18-310 was associated with both SL and FAR, with R 2 ranging from 10.01% to 13.53% in the GLM models. A significant MTA for DKMP18-156 with FCC was also detected using GLM analysis with R 2 as 10.19%.     Figure 9. Distribution of alleles among clusters per locus. The pie chart with different color parts represents the different allele numbers of each locus among clusters, and the green pie represents zero allele of the loci existing in the cluster, the red pie represents one allele of the loci existing in the cluster, and the yellow pie represents more than one allele of the loci existing in the cluster.

Association Mapping
We identified 18 significant marker-trait associations (p < 0.00044, 0.1/228) for the nine traits using GLM and MLM models in association mapping. A greater number of significant marker-trait associations were observed using GLM (13) compared to MLM (5).
Six of the detected MTAs exhibited R 2 ≥ 10% in the general linear model or p < 0.00044 in the two models and were retained for further analysis ( Table 7). As an exception, we included the significant MTAs for loci DKMP17-282 associated with FCR, DK11-297 and DK11-271 associated with NLV both in GLM and MLM analysis with R 2 values ranging Figure 9. Distribution of alleles among clusters per locus. The pie chart with different color parts represents the different allele numbers of each locus among clusters, and the green pie represents zero allele of the loci existing in the cluster, the red pie represents one allele of the loci existing in the cluster, and the yellow pie represents more than one allele of the loci existing in the cluster.

Discussion
Plant germplasms are valuable resources that can be used for the breeding of new cultivars as well as evolution studies. Persimmon is abundantly and widely distributed in Zhejiang Province [13,14], which may represent an abundant variability resource comprising desired genes/alleles due to unique climate and diverse ecology. In the study, all morphological traits of the persimmon germplasms showed broad ranges, indicating that the association panel includes diverse germplasms that can be used in breeding programs. Therefore, the elucidation of genetic structures and diversity patterns of these sources and the identification markers associated with agronomic traits using 17 SSR markers was performed in this study.
In accordance with the present results, previous studies [12,13,15] have demonstrated that there are many differences in the morphological traits of persimmon germplasm from different eco-regions population, while few analyses have looked into the relationship between morphological traits and environmental factors in persimmon and similar species. In this study, mean temperature of the quarter showed little significance in morphological traits, perhaps due to all sampling sites belonging to the subtropical monsoon climate and no strict temperature limit for persimmon survival. Consistent with previous studies of many fruit trees, which revealed that light [59][60][61] and water supply [62][63][64][65][66] played an important cue in both leaf and fruit development, and precipitation and solar radiation were the main climate factors affecting the morphological traits of persimmon in Zhejiang Province. There were also longitudinal differences in morphological trait diversity. These can be attributed to the comprehensive climatic and terrain factors at the establishment site.
The studied SSR markers exhibited a high level of genetic diversity. The mean PIC values were 0.8965 (i.e., all were above 0.5), the threshold indicating the existence of a very high level of marker polymorphism as suggested by Botstein [67]. The mean PIC value was much higher than those of available reports in Luo within a few germplasms [68], and comparable to 0.72 and 0.827 in the evaluation reports of 228 persimmon accessions through 12 gSSR markers [12] and 71 persimmon accessions through 19 gSSR markers [69], respectively, although this may be due to the diverse accessions in the present and other studies. The abundance of genetic information as demonstrated by the highly polymorphic SSRs will be helpful in understanding the genetic structure of the persimmon population in our study.
Compared to the nationwide accessions or germplasms, the population mean value of I (0.3694) from the persimmon sources in Zhejiang Province were lower than the I (1.6014) in 133 accessions originating from China, Japan, and America using 17 neutral SSR markers in Liang's report [11]. The diversity of persimmon sources in Zhejiang Province was higher than in the evaluation of 189 D. kaki germplasms in Guangxi Province through 13 gSSR markers, with a higher mean value of Nei (0.3694) and I (0.3694) in this study compared with the Nei (0.149) and I (0.231) in Guangxi Province [70]. This indicates that persimmon sources were distributed differently in the two provinces, which may have been caused by differences in the terrain and persimmon source evolutionary history or different SSR markers used in these studies.
In addition, we detected significant relationships between Na and environmental factors. These results seem to be consistent with other research that found AFLP loci in European beech (Fagus sylvatica L.) [71] and European white birch (Betula pendula Roth.) [72], SSR loci in Eucalyptus gomphocephala DC. [73] and American chestnut (Castanea dentata (Marshall) Borkh.) [74]. Among them, the most highly correlated climatic factor was WinPR, WinSR, and SprMT, while SumMT and SumPR did not show any significance. This result indicates that the climate conditions in winter were more correlated to genetic traits than those in summer. Specifically, less precipitation and more solar radiation in the winter were indicative of a greater number of alleles within a population. SprMT was also a close factor affecting genetic parameters. A possible explanation for this might be that a cold spell in later spring is a serious obstacle for persimmon leaf germination. The distribution of persimmon resources might result from the natural selection and adaptation to the local climate, which may be one of the major causes of the longitudinal gradient in genetic diversity. A significant correlation between genetic distance and geographic distance was detected by the Mantel test. Additionally, a clear geographic origin-based population structuring was performed through comparing the genetic parameters and morphological traits among the population. For example, coastal populations, with low polymorphism levels but high fruit calcium content, fruit water content, and fruit and seed shape indices, may be marginal populations according to the central-marginal hypothesis that marginal populations show less genetic variation and higher differentiation than central popula-tions [75][76][77]. The formation reason may partly be explained as these populations being located in the seaside, and these high mountains in eastern coastal areas also dampen the genetic exchange between coastal populations and inland populations, which was also supported by the low pairwise FST values. Meanwhile, there was extensive genetic exchange among populations that originated inland or in a neighboring province though the long history of persimmon cultivation. The similar geographic patterns of gene exchange have also been found in China fir (Cunninghamia lanceolata (Lamb.) Hook.) [78], Sakhalin fir (Abies sachalinensis (F.Schmidt) Mast.) [79], and Mongolian oak (Quercus mongolica Fisch. ex Ledeb.) [80].
The 179 D. kaki germplasms from the 16 populations were broken up into three distinct clusters (I, II, and III) based on the comprehensive results of the NJ dendrogram, PCA and STRUCTURE, with a higher mean FST (0.2714). Nearly all samples of Cluster-I were distributed inland, possibly due to a lack of adaptation genes (such as resistance to salinity or to higher groundwater levels). The samples from coastal populations were mostly classified as Cluster-II and Cluster-III. The samples from Cluster-II and Cluster-III were widely distributed throughout Zhejiang Province. There may still be less gene (or germplasm) exchange between these coastal populations and inland populations. These germplasms from Cluster-I may be eliminated by coastal climate and environmental conditions (e.g., many typhoons, highly alkaline soils), and samples from the coastal region may contain advantageous genes that allow for survival under a wide range of environments. Although the formation mechanism of the distribution and genetic structure characteristics of persimmon resources in Zhejiang Province require further research, the information in the germplasms suggests that a gene locus of differential adaptability in a corresponding cluster will be an important resource for future environmental adaptation research in persimmon.
Identification of the loci influencing agronomic traits is the basis of marker-assisted breeding. However, due to a lack of genomic information, association studies of other quantitative traits have not been reported in persimmon. In the present study, we performed association mapping and identified markers associated with nine traits of agronomic importance. Improvement in fruit taste was the main breeding target of persimmon fruit. The fruit crude fiber content and calcium content were both closely related to fruit taste [81]. The DKMP17-282 loci was strongly correlated with FCR, as supported by a high correlation coefficient in both GLM and MLM analysis (R 2 ranging from 9% to 9.2%). The DKMP18-156 (R 2 = 10.19%) loci were identified as strongly correlated with the FCC in the GLM analysis. Hence, these SSR locus can be used in molecular-assisted breeding of persimmon fruit to improve their taste.
Among horticultural crops, fruit size and shape are also important considerations for consumers. However, little has been done regarding studies on persimmon fruit shape due to the difficulty of quantitatively characterizing fruit shape features. By using association mapping in this study, we identified loci DK26-184 and MDP18-310 as linked with the FLD and FAR in persimmon, respectively. Among these loci, MDP18-310 (R 2 = 7.39-13.53%) was consistently associated with the FAR in both models. In the present study, strong positive correlations between the shape traits of fruit and seed, with higher pairwise correlation coefficients and some geographic distribution pattern, consistent with previous observations in F1 progenies [82] and germplasm resources [15,83]. These results might be explained in part by immature seeds producing a mass of a gibberellin-like substance to promote fruit development [84]. Intriguingly, among the four MTAs of seed width or length, the loci MDP18-310, associated with the FAR, was also associated with SL in GLM analysis (R 2 = 10.01%). Our results further support Haruka's description [85] that variations in fruit and seed shapes among cultivars are coordinately controlled, and that MDP18-310 may be located in the genome region for controlling both these traits.
Due to the complexity of the observed quantitative traits, these MTAs explain a small portion of phenotypic variance (about 10%) consistent with previous association mapping studies [86][87][88][89] while, the breeding of quantitative traits in plants usually utilized the significant association combinations by cumulative genetic effects. In this study, the loci were both strongly correlated with the NLV, which was significantly associated with two MTAs (DK11-297 and DK11-271) with additive effects, which could add up to explain 19.7% of phenotypic variation without considering the intermarker effect.
Although these significant marker-trait associations have still not been documented sufficiently due to lack of information on genomic location and gene function of these SSRs, our study will provide a reference for a genome-wide association study (GWAS) in persimmon. These MTAs will be useful for the future marker-assisted breeding of persimmon.

Conclusions
In this work, large-scale evaluation of persimmon germplasm in Zhejiang Province based on SSR markers was carried out for the first time. The results showed that there was a medium degree of genetic diversity found for persimmon in Zhejiang Province. The 179 D. kaki germplasms from the 16 populations could be separated into three clusters with distinct geographical pattern distribution. These provide a good basis for further research on the genetic basis of adaptation in D. kaki, and therefore the selection of suitable environment-marker associations for further breeding. In our study, 13 significant markertrait associations were identified through the general linear model with nine traits of agronomic importance, of which, six MTAs with correlation coefficients (R 2 ) > 10% were consistently represented in the general linear model or p < 0.00044 both in the general linear model and mixed linear model. As the first association mapping in D. kaki, the study will provide a reference for large scale MTA identification though a genome-wide association study (GWAS) in persimmon. The potential marker-trait associations identified in this study will be useful in facilitating marker-assisted breeding for D. kaki improvement and identification of candidate genes for trait variability in D. kaki.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10 .3390/f12040422/s1, Table S1: Environmental factors, population, group, and cluster information of the 179 sampled D. kaki germplasms; Table S2: The statistics of the nine agronomic traits of the 179 D. kaki germplasms; Table S3: Characteristics of the 17 SSR markers used in this study; Table S4: All genotypic data across 179 D. kaki germplasms by 17 SSR markers; Table S5: The information of allele frequencies; Table S6: Correlation analysis between morphological traits for the 179 persimmon germplasms; Table S7: Genetic diversity of the 17 SSR markers in 179 persimmon germplasms; Table  S8: The Na and Ho within each accession; Table S9: Correlation analysis between heterozygosity observed, number of alleles, and environmental factors for the 179 persimmon germplasms; Table  S10: Pairwise FST estimates for groups resulting from the cluster analyses; Table S11: The molecular analyses of variance (AMOVA) among the germplasms according to origin of the populations; Table  S12: Molecular analyses of variance (AMOVA) among the persimmon germplasm clusters.
Author Contributions: Y.X. and B.G. conceived and designed the study; Y.X., B.G., X.J., and K.W. organized and conducted the trial sampling and managed the samples; Y.X., W.C., and C.X. carried out the laboratory analyses; Y.X. and W.C. collated, managed, and analyzed data; Y.X. and B.G. wrote the paper. All authors have read and agreed to the published version of the manuscript.