Genetic Diversity and Primary Core Collection Construction of Turnip ( Brassica rapa L. ssp. rapifera Matzg) Landraces in Tibet Revealed via Morphological and SSR Markers

: Turnip, one of the oldest groups of cultivated Brassica rapa species, is a traditional crop as well as a form of animal fodder, a vegetable, and a herbal medicine that is widely cultivated in farming and farming-pastoral regions in Tibet. Different regions of the Qinghai–Tibet Plateau (QTP) are home to a rich diversity of turnip owing to their high altitudes and variable climate types. However, information on the morphology and genetic diversity of Tibetan turnip remains limited. Therefore, the genetic diversity of 171 turnip varieties from China and elsewhere (Japan, Korea, and Europe) was analyzed using 58 morphological characteristics and 31 simple sequence repeat (SSR) markers in this study. The varieties showed that the genetic distance ranged from 0.12 to 1.00, and the genetic similarity coefﬁcient ranged between 0.73 and 0.95. Cluster tree showed two distinct clusters. Both morphotype and geography contributed to the group classiﬁcation. A combination of morphological traits and molecular markers could reﬁne the precision of accurate identiﬁcation compared to the separate morphological and molecular data analyses. The sampling ratio of 15% to utmost precisely represent the initial population was compared to ratios of 10% and 20%, and the sampling ratio of 15% is recommended for future works when a primary core collection of turnip resources is constructed. These results could furnish a foundation for germplasm conservation and effective turnip breeding in future studies.


Introduction
Turnip (Brassica rapa syn. B. campestris L. ssp. rapifera Matzg), belonging to the Brassica genus, has been an important global crop for centuries, first being cultivated in China in 2500 B.C. It is known by many Chinese names, such as Feng, Manjing, Yuancaitou, Yuangen, Pancai, Buliuke, Jiuyingsong, Yuanluobo, and Zhugecai [1]. As a B. rapa subspecies, this turnip group represents one of the oldest groups of cultivated B. rapa type [2]. In the Tibet Autonomous Region of China, turnip is called "Niuma" and is widely used as a vegetable, fodder, oilseed, traditional Tibetan medicine, and as raw material in butter lamps [3,4]. Additionlly, it is considered an ideal cash crop for cold areas at high altitudes area due to its relatively short growth period and its high resistance to barren conditions, and cold climates.
The Tibet Autonomous Region (TAR), located in southwestern China, comprises the main section of the Qinghai-Tibet Plateau (QTP), which has an average elevation of 4000 m above sea level. The annual temperature ranges from-0.5 • C to 3 • C, a range suitable for turnip growth [5]. Turnips are widely cultivated in agricultural areas and farming-pastoral regions in the TAR, and its vertical altitude distribution could reach 4300 m [6]. The typical environmental characteristics of the QTP, include high elevation, high ultraviolet radiation intensities, complex terrain, and diverse climate conditions, render long-term artificial selection plant genes prone to mutations leading to variations [7]. Meanwhile, status of the underdeveloped transportation infrastructure in the region prevents the Tibetan turnip from coming into contact with external turnip germplasms, preserving primitive germplasm groups. The traditional independent family farm mode and the typical geographic origin result in high diversity among these individual turnip landraces [8]. Beyond this, the QTP occupies an important position as the world's earliest and largest agricultural and cultivated plant origin center [9]. Many of the locally cultivated and semi-wild species of Brassica have been collected in the southeast part of Tibet [10]. Wang et al. proposed that Tibet's central and east zones comprise two distribution areas of wild rapeseed and dozens of wild rapeseed populations on the Tibetan Plateau have been found, suggesting that Tibet Plateau is one of the origin centers of Brassica rapeseed globally [11].
Despite being an ancient and well-known plant species, turnip cultivation is decreasing in most countries. The QTP is Asian's water tower and plays an essential role in global climate regulation [12]. The acceleration of urbanization and changing modern agricultural production methods are both underway in Tibet [13]. Many related factors have led to the decline in turnip landraces in this area, e.g., during the implementation of the Green for Grain Project on QTP, in which the plantation areas of crops is reducing gradually, and forage crops, excluding turnip, are significantly increasing in recent years [14]. Moreover, Tibetans have begun using cultivate turnip of F 1 hybrid seeds at scale, which may lead to a significant decrease in, or even a loss of, the special landraces of turnip in Tibet [11]. The assessment of genetic diversity and resource collection is a prerequisite for efficient organization, conservation, and utilization of turnip and other Brassica species improvement, new cultivar deployment, and hybrid seed production [15][16][17]. However, reports about the genetic diversity and collection of turnip resources from Tibet are limited thus far. Therefore, collecting and assessing "Niuma" resources and constructing a Tibetan turnip landrace bank are urgent for the diversity retention of turnip landraces in Tibet.
Genetic diversity analysis is the basis of variety resource evaluation, utilization, and preservation, as well as breeding new varieties [18]. Many studies on the genetic diversity of turnip have been conducted following morphological characteristics or limited molecular markers, revealing that turnip (Brassica rapa) has a high genetic variation [1,19]. However, morphological descriptions vary significantly based on environmental factors, e.g., farming practices, age, and the developmental stages of plants [20]. In recent decades, different types of molecular markers, including restriction fragment length polymorphisms (RFLP), amplified fragment length polymorphisms (AFLP), random amplified polymorphic DNA (RAPD), simple sequence repeats (SSR), and single nucleotide polymorphism (SNP), have been applied to evaluate the genetic diversity of Brassica, particularly turnip-type rapeseed, in different regions by many researchers [2,[21][22][23]. Among these markers, the development of SSR technology has been widely used for analyzing genetic diversity in plant species due to its low cost and simple use to reveal the genetic relationships among different varieties [1,17,24,25].
This study aimed to determine the genetic diversity, the relationship, and the genetic structure of turnip varieties from a broad range of geographic regions in the QTP using SSR markers and morphological data. Then, a representative sampling ratio when constructing a primary core collection of turnip was calculated. This will provide a basis for protecting Tibetan turnip landraces and production applications.

Plant Materials and Experimental Design
A total of 171 varieties were collected, including 118 turnip landraces from the central and eastern zones (two distribution areas of wild Brassica) of the Tibet Autonomous Region (from 26 • 50 N, 78 • 25 E to 36 • 53 N, 99 • 06 E), 23 varieties from different provinces of China, and 30 varieties from Europe, Japan, and Korea ( Figure 1). The materials included 169 turnips and two Chinese cabbages (B. campestris L. ssp. pekinensis) and they were numbered 177-01 to 177-171 (Table S1). All varieties were planted in the greenhouse for DNA extraction and morphological investigation. The experiment was carried out from October to August of next year, 2017 to 2019, at the experimental station of Zhejiang University located in Wuxi, Jiangsu Province (from 31 • 07 N, 119 • 33 E to 32 • 02 N, 120 • 38 E). Varieties were initially sown in the seeding tray containing nutrient soil (1-2 seeds per hole), and seedlings at the 6-8 true leaves stages were transplanted into a greenhouse. About thirty plants of each variety with uniform growth were selected and arranged in a randomized complete block design with their own labels. Agronomic practices including irrigation, fertilization, and deworming were conducted uniformly as required in all blocks during the culture.

Investigation and Statistics of Morphological Characteristics
A total of 58 morphological traits including 33 quantitative and 25 qualitative traits related to cotyledon, leaf, tuber, and flower were measured at different stages (Table S2) [26]. Ten plants of each variety with uniform growth were used for the investigation.
The qualitative traits were evaluated visually and converted to standard data as shown in Table S2 following the description of Descriptors and Data Standard for Root and Stem Mustard (B. juncea Coss.) [5,26]. The quantitative traits (length, width, diameter, and weight) were measured with a ruler, a caliper, and an electronic balance. All raw data were recorded in a Microsoft Excel 2016 (Microsoft, Washington, DC, USA) worksheet, and mean values, as well as standard deviations, of the raw quantitative traits data were obtained. The Shannon-Weaver (H') values of qualitative traits were calculated based on the distribution frequency of each grade. Data on quantitative traits were divided into 10 grades according to the mean (χ) and the standard deviation (σ), e.g., 1 grade X i < χ − 2σ, 10 grade X i > χ + 2σ, with a 0.5σ difference between each adjacent grade as described by Wang et al. [27]. The formula for calculating the H' was as follows: H = − ∑ P i ln P i , where i is the grade, and P i indicates the probability of the i grade in each trait.
To make the data effective and sensible and to facilitate comparisons between them, the variables were standardized (Z-score) by SPSS v 19.0 (SPSS, Chicago, IL, USA). This makes the data unitless, expressed only in terms of deviation from an index of centrality (e.g., the mean or the median). Then, principal component analysis was performed for 53 morphological traits in the same software. Finally, a morphological dendrogram was constructed using the clustering method of Euclidean distance metrics with average connections among groups based on the principal component analysis by SPSS v 19.0 (SPSS, Chicago, IL, USA) [5].

DNA Extraction and SSR Analysis
Young, expanding leaves from at least three plants in each variety were bulked in the seedling stage for DNA extraction. The method of DNA extraction and polymerase chain reaction (PCR) analysis followed the methods of Gao et al. [1,28]. The quality of the DNA template was detected using a UV spectrophotometer and agarose gel electrophoresis.
A set of 31 SSR loci with clearly distinguishable polymorphic bands and stable reproducibility in the pre-experiment were selected to genotype all varieties (Table S3). These markers were developed by Wang (2012) [29] based on the whole genomic sequence of B. rapa (A01-A10, Accessions Codes: DDBJ/GenBank/EMBL_AENI00000000 AENI01000000) [30] and have been used previously in genetic diversity analysis of turnip [5].
PCR products were separated via 12% non-denatured polyacrylamide gel electrophoresis (Native-PAGE) and visualized using the silver staining method as reported by Gao et al. [1,31]. All major DNA band sizes were detected using a 100-1000 bp DNA ladder as the reference. Clearly distinguishable polymorphic single bands ranging from 50 bp-500 bp were scored as present (1) or absent (0) and recorded into an Excel worksheet to form a 0/1 data matrix following the procedures described previously [1,2,5]. Heterozygous genotypes were not used.

Data Analysis
Genetic statistics including major allele frequency, the number of alleles, genetic diversity, and the polymorphic information content (PIC) of each microsatellite locus were calculated in PowerMarker v 3.25 [32]. Band frequency, Shannon's Information Index (I), expected heterozygosity (He), and unbiased expected heterozygosity (uHe) were calculated using GenAlex 6.51 [33]. Pairwise genetic similarity coefficients were calculated using the Dice coefficient by the SIMQUAL program of NTSYS2.10e [34]. Afterwards, a dendrogram for SSR markers was developed based on the genetic similarity matrix through the unweighted pair group method with an arithmetic mean (UPGMA) algorithm and generated by the SHAN cluster analysis of the NTSYS-pc. Cophenetic correlation values between the original genetic similarity coefficient matrices and the cophenetic value matrices given by the UPGMA clustering process were calculated to support the clustering by the Mantel test [34]. The correlation of the similarity coefficient matrices between the morphological traits and the SSR markers was also revealed using the same procedure. Moreover, a joint analysis based on combined data of morphological traits and SSR markers was also conducted to verify the precision of clustering compared to the separate morphological and molecular data analyses.
The population genetic structure of 171 turnips was evaluated by a Bayesian clustering algorithm implemented in the software STRUCTURE v.2.3.4 [35]. Parameters were set to their default values as advised by Evanno et al. [36]. Ten independent runs were conducted with a burn-in period of 100,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) repetitions for each value of K (range 1-10) [16,36,37]. The optimum number of subpopulations (K) was estimated by plotting the estimated likelihood values and the Ln P (D) and calculating the delta (∆K) model, which was based on the rate of change in the log probability between successive K values developed by Evanno et al. [36]. Structure Harvester [38] (http://taylor0.biology.ucla.edu/struct_harvest/ accessed on 1 September 2021) was used to compare and visualize the likelihood values across multiple values of K for determining the most probable value of K as well as the Q-matrices data. Repeat sampling analysis for a given K was performed by the CLUMPP1.1.2b [39]. The individual and population Q-matrices were computed as the mean overall individual and population Q-matrices after columns were aligned according to the permutation with the greatest H-value. The varieties with membership probabilities greater than or equal to 0.50 were considered to belong to same individual cluster [37,40]. Principal component analysis (PCA) based on SSR markers was conducted using Past 3.26 software [41] to verify the results obtained with STRUCTURE.

Construction of Primary Core Collection in Tibetan Turnip
Data of morphological traits and SSR molecular markers combined with geographical origins were considered for the primary core collection. The three steps of construction were as executed. (1) Three primary core collections corresponding to different scales were constructed to identify the optimal core collection ratio (10-20%). Based on the dendrogram of clusters, the preferred sampling strategy combined with stepwise clustering was used to construct core collections. (2) The evaluation of the primary core collection. Four evaluation parameters for morphological traits were selected [42], namely, the mean difference percentage (MD), the variance difference percentage (VD), the coincidence rate of range (CR), and the change rate of the variation coefficient (VR), for testing the diversity and representativeness of the primary core collection using SPSS v19.0 (SPSS, Chicago, IL, USA). The formulas for calculating the parameters were as described by Hu et al. [42]. Additionally, the mean values of major allele frequency, the number of alleles (Na), gene diversity, PIC, the effective number of alleles (Ne), the expected heterozygosity (He), the unbiased expected heterozygosity (uHe), and Shannon's information index (I) between the initial population and the primary core collection based on SSR molecular data were estimated using PowerMarker v3.25 [32] and GenAlex 6.51 [33]. (3) PCA and correlation analyses based on morphological data were used to assess the primary core collection, which effectively eliminated genetic redundancy.

Analysis of Morphological Characteristics
Certain variation degrees were detected among the tested materials. The range of the coefficient of variation was 0-77%. Variations were extremely low (CV = 0,H' = 0) for traits in cotyledon color, leaf type, leaf wax powder, tuber shoulder shape, leaf vein freshness, leaf vein gloss, germination ability of lateral bud, and bolting ability. Traits on the leaf including leaf shape, leaf surface, leaf color, and leaf fissure showed a relatively high similarity (Table S4). However, other morphological traits like hypocotyl color, scar on tuber shoulder, skin color of tuber shoot, exterior color of above-ground tuber and lateral tuber distribution, cotyledon length, plant breadth, petiole length, single plant weight, tuber weight, fleshy tuber width and length, sepal length, and morphological characteristics of floral organs showed high variations. Plant height, plant breadth, tuber weight, sepal width, petal length, short filament length, anther length, and style length showed relatively high variations and diversity (Table S5 and Figure S1). Variation in the color of the tuber shoot skin was the largest, whereas variations in some of the morphological indices of the leaves were smallest. These results indicated an abundant diversity among the turnip samples. Tibetan turnips showed a higher diversity than the whole population, which was revealed in the larger value of H of morphological traits, especially the quantitative traits.
The morphological dendrogram based on the clustering method of average connection clearly showed the genetic relationship among turnip varieties. A total of 171 varieties were divided into two large subgroups. The first cluster (I) comprised six varieties from Japan and Europe and 18 from China (mainly from Tibet). The second cluster (II) comprised the rest of the turnip varieties tested ( Figure S2). Additionally, 118 turnip varieties from Tibet were clustered into two large subgroups. Varieties from Changdu and the other nine varieties were clustered in subgroup1(I), and the rest were included in subgroup 2 (II; Figure S3). The Wilcoxon rank-sum test showed that the plant height, plant length, plant width, lamina length, lamina width, and petiole thickness were significantly different (p < 0.05, Table S6). Namely, these six aboveground traits were the primary contributors to population clusters.

Genetic Variation of SSR Markers
In the present study, 161 alleles were produced by 31 polymorphic SSR loci, ranging from 3 to 9 with an average of 5.19 alleles per locus for the genetic characterization of 171 turnip varieties. The obtained values on alleles were higher compared to values in previous studies, indicating the higher allele richness. The length of the amplified fragments was between 100 and 500 bp. A total of 181 amplified fragments were detected, and 163 of them were polymorphic; the polymorphic rate reached 90.1%.

Genetic Similarity Coefficient Analysis and UPGMA Cluster Analysis Based on SSR Markers
Wide genetic variations were observed in all the varieties: the genetic distance and the coefficient of genetic similarity (GS) in 171 varieties ranged from 0.12 to 1.00 and 0.73 to 0.95 (data was not shown). For 118 turnip landraces from Tibet, the genetic distance and the GS ranged from 0.12 to 0.93 and 0.72 to 0.93, respectively, suggesting wide genetic variations in each variety from Tibet.
A phylogenetic tree divided the 171 varieties into two subgroups when GS was from 0.73 to 0.732 ( Figure S4). The cophenetic correlation coefficient value between the original similarity matrix and the dendrogram was rSSR = 0.65. The first subgroup comprised 28 varieties, including 11% landraces from Tibet, 36% from other provinces of China, and 36% from Japan; the second subgroup comprised 143 varieties, including 89% landraces from Tibet, 64% varieties from Japan, 64% varieties from different provinces of China and all the European, Korean, and Chinese cabbages. Subgroup II with a GS of 0.742 was further divided into two clusters. Similarly, all 118 landraces from Tibet were divided into two major subgroups at a GS of 0.72 to 0.729 ( Figure S5). Thirteen varieties mainly from Changdu (six), Xigaze areas (four), and Shannan (two) belonged to subgroup 2, and 105 varieties from various regions of Tibet belonged to subgroup 1 ( Figure S6). Na = no. of different alleles; Ne = no. of effective alleles = 1/(pˆ2 + qˆ2); I = Shannon's information index = −1* (p * Ln (p) + q * Ln(q)); He = expected heterozygosity = 2 * p * q; uHe = unbiased expected heterozygosity = (2N/(2N-1)) * He. no. allele = number of alleles; PIC = polymorphic information content (PIC).

Relationship between Morphological Traits and SSR Markers
Mantel testing revealed that correlations of the similarity coefficient matrices between the morphological traits and SSR markers were very weak ( Figure S7, r = -0.04, p = 0.10), confirming the discordance between morphological traits and SSR markers.
Cophenetic correlation values between the original genetic similarity coefficient matrices and the cophenetic value matrices calculated from morphological traits and SSR markers were rMOR = 0.57 and rSSR = 0.65, respectively, indicating a moderate fit for both data. However, the value was up to 0.82 when a combined data of morphological traits and SSR markers was analyzed, which indicates a higher-quality clustering of the combined data. The UPGMA dendrogram separated 171 turnip varieties into two main clusters, I and II, at the similarity coefficient of 0.51, as shown in the morphological and molecular dendrogram ( Figure S8). The Wilcoxon rank-sum test showed significant differences in plant length, plant width, the number of leaves, fleshy tuber width, flower length, flower width, sepal length, petal length, petal width, pistil width, cotyledon color, leaf shape, leaf color, the scar on the tuber shoulder, and the cross-section shape of petiole between the two subgroups based on the SSR dendrogram (p < 0.05). Similarly, significant differences in plant width, the number of leaves, leaf shape, leaf color, the exterior color of the aboveground tuber, and the cross-section shape of petiole were also observed between the two subgroups of Tibet landraces (Table S7).

Population Structure and Principal Component Analysis of the Turnip Germplasm
The genetic structure of the population showed that lnP(D) increased consistently, while the K value increased from 1 to 10. The most apparent change in lnP(D) was observed when K was increased from 2 to 3 (Figure 2a), and a clear peak for ∆K was obtained when K = 2. The optimum number of subpopulations was 2 according to the plot of delta K against K [36] (Figure 2b). Accordingly, the 171 turnip varieties were roughly divided into two major populations, namely, P1 (green color) and P2 (red color), which contained 31 and 140 varieties, respectively (Figure 2c,d) (Q-matrices were not shown). The percentage of genotypes with a membership coefficient ≥90% was 61.99%. A total of 74.85% of genotypes exhibited a membership coefficient ≥80%. P1 was comprised of 11% of the landraces from Tibet, 44% of the varieties from Japan, and 30% of those from other provinces of China. The percentage of genotypes with a membership coefficient ≥80% was 67.74%. P2 contained 89% of the landraces from Tibet, 56% of the varieties from Japan, 70% of the varieties from different provinces of China, and all the European, Korean, and Chinese cabbages (Table S8). The percentage of genotypes with a membership coefficient ≥80% was 76.43%. Additionally, the genetic structure of 118 varieties from Tibet was analyzed separately ( Figure S9a-d). The varieties comprised two populations: P1 (green color), which contained 13 varieties and P2 (red color), which contained 105 varieties, which was consistent with the results of the whole population. The percentage of genotypes with a membership coefficient ≥90% and ≥80% was 67.80% and 83.90%, respectively. A total of 76.92% of the genotypes exhibited a membership coefficient ≥80% in P1 and 84.76% in P2. In general, both population structure analysis of 171 varieties and 118 Tibetan landraces demonstrated high accordance with the cluster analysis.
The principal component analysis (PCA) divided 171 turnip varieties into two groups, k_ 1 and k_2 (Figure 3), which were matched to P1 and P2, and their relationship was consistent with the population structure. The main components of PCA 1 and PCA 2 explained 9% and 5.3% of the molecular variations, respectively. Another two clusters for 118 varieties from Tibet also emerged, k_1 and k_2 ( Figure S10), which were contained in 13 and 105 landraces and matched to the population structure as well. The main components of PCA 1 and PCA 2 explained 9.1% and 3.6% of the molecular variations, respectively. Results of the PCA corroborated the cluster analysis and the population structure results.

Construction and Evaluation of the Primary Core Collection in Tibetan Turnip
Three primary core collections including 34, 26, and 19 accessions based on different sampling ratios of 10, 15, and 20 percent were constructed along with the stepwise clustering method. In the 20% collection, 24 were from Tibet, 6 were from abroad, and 4 were from other provinces. In the 15% collection, 18 samples were from Tibet, 6 were from abroad, and 2 were from other provinces of China, except Tibet. In the 10% collection, 12 samples were from Tibet, 5 were from abroad, and 2 were from other provinces. Germplasms with maximum or minimum values of morphological traits were preserved (Table S9). The primary core collection is considered to be the representative of the initial collection when MD ≤20% and CR ≥80% [42]. The VD, CR, and VR values as well as the strength of the representation of the core collection increased with decreasing MD. In this study, the evaluation parameters of each primary core collection showed that VD and VR based on morphological traits at the sampling ratio of 15% were greater. Additionally, the mean value of Na, gene diversity, PIC, Ne, I, as well as He based on molecular data were all higher than initial populations or other pre-core collections ( Table 2). All results indicated that the primary core collection had the best representation when the sampling ratio was 15%. Further, the 15% primary core collection was assessed with PCA based on morphological data. The eigen value, the contribution percentage, and the cumulative contribution percentage of the primary core germplasm were larger than those of the initial population (Table S10). Hence, the 15% primary core collection effectively eliminated genetic redundancy and increased the cumulative contribution percentage.  ; I = Shannon's Information Index = -1* (p * Ln (p) + q * Ln(q)); He = expected heterozygosity = 2 * p * q; uHe = unbiased expected heterozygosity = (2N/(2N-1)) * He.

Discussion
Understanding the genetic relationship among the collected Tibetan turnip landraces will be the key for future research on germplasm innovation and breeding cultivars [1]. The combination of phenotypic trait investigation and SSR molecular markers is an essential and complementary method for genetic diversity analysis [17,43]. The precision of identification could be increased by the effective utilization of the associated characters with molecular markers [44]. In the present study, the genetic relationships among 171 turnips varieties were investigated based on morphological traits and SSR markers. All SSR markers displayed a high polymorphic information content (PIC = 0.516). The number of alleles and PIC were higher than previous studies reported on turnips [1,5]. The morphological traits also indicated an abundant diversity among all turnip varieties consistent with previous studies [5]. The clustering maps estimated from 58 morphological traits and 31 SSR markers demonstrated discordance to a certain extent. This was confirmed by the results exhibited by the Mantel test, consistent with earlier studies on other vegetable species [17,45]. Unexpectedly, 31 genetic and 58 morphological markers both showed a moderate-quality clustering, which indicates they were not identifying the varieties well separately. Additionally, the cophenetic correlation values showed a higher-quality clustering when a combined data of morphological traits and SSR markers were analyzed. These results indicated that the SSR markers and morphological traits were unable to replace each other solely to identify the distinctness of turnip varieties, but a combination of morphological traits and molecular markers could improve the precision of accurate identification compared to the separate morphological and molecular data analyses [17]. The reason might be that (i) the SSR markers are designed based on the non-coding DNA regions according to the whole genomic diversity, (ii) morphological traits are the product of synthetic biology and the environment, and only a few select genes cause the extreme morphologies [24]. Results from the Wilcoxon rank-sum test revealed significant differences in some qualitative and quantitative traits, such as plant width, leaf color, the number of leaves, and tuber characteristics between the two subgroups from the molecular-based tree (p < 0.05) (Tables S6 and S7), suggesting that these traits are possibly associated with SSR molecular markers and are worthy of further study.
Previous studies revealed that East Asia, with its ancient and extensive trade routes, is at the center of diversity of B. rapa [2,46]. Results of morphological traits and molecular markers in this study indicated rich genetic diversity among turnip varieties. The higher gene diversity, the larger number of alleles, the abundant PIC, and the wider H of morphological traits found in 118 turnip landraces showed that the genetic diversity of Tibetan landraces was higher than the whole population [5]. The values obtained on alleles were higher than the values in previous studies [1]. The genetic distances of 118 turnip landraces (0.121-0.932) featured much higher genetic diversity than the varieties found in Xinjiang "Qamugur" (the local name for turnip in Xinjiang, 0.098-0.160), turnips from different provinces of China (0.187-0.344), and varieties from Nordic areas (0.016-0.146) [1,47]. Moreover, a 15.5% variation in genome size has been observed between the landraces from two high-altitude regions [48]. This result may explain the much higher genetic diversity in Tibetan turnip landraces. These results indicated that landraces often harbor rich diversity that is essential for maintaining local traditional agricultural sustainability [17].
The enormous phenotypic variation was displayed in Brassica species, including heading and tuber-forming morphotypes due to the mesohexaploidization of two Brassica genomes [49]. Various preferences for turnip organs from different regions led to different morphological selections [50]. Unlike most landraces from Tibet with large bulk tubers ( Figure S1), turnips from Japan all have ascending rosettes, smaller taproots, and narrow leaves, and turnips from Europe all have long, slender tubers and deep leaf fissures [51]. Previous studies have reported that the thickened part of turnips consists of both a hypocotyl and a root, and this group was one of the oldest groups of cultivated B. rapa [22,52,53]. Europe and Asia are two different independent origin centers of cultivated B. rapa subspecies [2]. In the European center, turnip and turnip rapeseed (oleifera) are the dominant forms, but Chinese turnip rape seed (oleifera) and leafy vegetables like Chinese cabbage, pak choi, and narinosa first originated from China. Subsequently, manifold morphotypes of different B. rapa accessions were derived and evolved separately from these two origin centers [54]. In the present study, the B. rapa ssp. pekinensis (Chinese cabbages) from China and Korea clustered together as neighbors and showed the smallest genetic distance, thus supporting the hypothesis that Chinese cabbage is the ancestor of Korean cabbage. Chinese cabbages were first introduced into ancient Korea during the Ming dynasty (1368 to 1644) and became the main ingredient in kimchi [2,55]. B. rapa ssp. rapa from Europe, B. rapa ssp. pekinensis, and 89% turnip landraces with high diversity from Tibet clustered together, implying a certain genetic relationship between these B. rapa subspecies. The proposed Chinese cabbage (ssp pekinensis) were hybrid from the northern turnip (ssp. rapa), and southern pak choi (ssp. chinensis) found in the wild in China could support this result [55,56]. Previous studies supported that the Mediterranean region, Afghanistan, Pakistan, Transcaucasia, and other Central Asia regions are the origin centers of ssp. rapa [2,16,51,55,57]. Moreover, De Candolle reported that ssp. rapa was cultivated before 2500-2000 B.C., much earlier than being introduced into Eastern Asia during the period after 1000 B.C. Considering these and combining the results of an earlier study [7], these results supported the hypotheses that either Tibet is one of the B. rapa gene centers worldwide [7] or Tibetan turnip was a branch of European turnip with various morphological traits. However, this requires further verification due to the limited sources of European groups and Chinese cabbages employed in this study. Similar to turnips from Tibet, turnips from Japan clustered into two groups, indicating that the turnips from Japan most likely had different origins [51], and Tibetan turnips have also been influenced to a certain degree.
Morphological traits and molecular markers resulted in the division of 171 varieties into two groups consistent with the population structure and the PCA, indicating the underlying relationships among populations with the geographical origin [58]. In previous research, Pradhan et al. assessed the genetic diversity among 180 B. nigra accessions by using SSR markers, revealing that B. nigra (L.) accessions from the same country of origin inclined to cluster together [59]. The genetic diversity of 25 turnips from Tibet, Gao found that their classification was related to the geographical origin [5]. Varieties that originated far apart showed fewer genetic similarities than those from adjoining geographical regions [44]. Furthermore, a similar relationship was also found in many Brassica vegetables [60,61]. However, some studies argued that population clustering could not be performed in accordance with geography but rather that it corresponds to morphotype [62][63][64] and flowering habit over geography [65], which was generally observed in previous studies [2,16]. Still, some have argued that both morphotype and geography contribute to the population structures of B. rapa subspecies and that broad morphological groups, such as oilseed and pak choi, were a primary contributor for population clusters [51]. In the present study, approximately 90% of the landraces from Tibet were clustered together, possibly because they all originated from one geographical region, and genetic exchanges between them or between them and other varieties from other places/origins are limited [16]. However, the 171 varieties were not divided into three large groups following their respective geographical origins. Plant length and width were significantly different whether in morphological or molecular trees. Therefore, we considered that both morphotype and geography contribute to the group classification of B. rapa subspecies, and the morphological traits of groups from the same geographical origin are similar due to the absence of gene communication, natural environment, and morphological selections [66]. In addition, most landraces collected from different regions of Tibet were clustered together ( Figure S10), indicating the presence of more accession exchanges and higher purity of landraces that potentially exist in Tibet [5].
Most researchers believe that 10-20% of the sampled scale in vegetable crops should encompass the genetic diversity of the initial population [40,67]. In the present study, we designed an equal scale core set. A minimum number of collections that represent maximum genetic diversity of an entire population is best [68]. Our results showed the primary core collection with a 15% sampling ratio represented the best genetic diversity and genetic variation in the whole population whether based on morphological traits or SSR markers, which is consistent with previous studies [69]. Although most international genetic turnip resources in this study lack detailed geographical origins, the core collection is conducive to be stored and exchanged in the later processes of genetic resources conservation and utilization.

Conclusions
Tibetan turnips feature high genetic diversity, and they could be clustered into two subgroups based on morphological traits and SSR markers. Both morphotype and geography could contribute to the group classification of B. rapa subspecies, and a combination of morphological traits and molecular markers could improve the precision of accurate identification compared to the separate morphological and molecular data analyses. The sampling ratio of 15% to most accurately represent the initial population was compared with 10% and 20%. Therefore, when a primary core collection of turnip resources is constructed, a sampling ratio of 15% is recommended for future works. Tibetan turnip resources would greatly benefit from expanded sampling, including more turnip landraces throughout Tibet to construct a comprehensive Tibetan turnip landrace bank and to increase the breeding of seed samples for conserving genetic turnip resources. This study provides a foundation and new insights for research into germplasm innovation and breeding turnip cultivars.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/agronomy11101901/s1, Figure S1. Morphological diversity of partial turnip varieties (overground and underground). Codes (177-01~177-171) for varieties found in Table S1. Figure S2. Hierarchical cluster analysis of 171 turnip varieties based on morphological traits. Codes (177-01~177-171) for varieties found in Table S1. I = group 1; II = group 2. Figure S3. Hierarchical cluster analysis of 118 turnip landraces from Tibet based on morphological traits. Codes (177-01~177-171) for varieties found in Table S1. I = group 1; II = group 2. Figure S4. UPGMA dendrogram by cluster analysis of 171 turnip varieties based on SSR markers. Codes (177-01~177-171) for varieties found in Table S1. I = group 1; II = group 2. Figure S5. UPGMA dendrogram by cluster analysis of 118 turnip landraces collected from Tibet based on SSR markers. Codes (177-01~177-171) for varieties found in Table S1. I = group 1; II = group 2. Figure S6. Geographical origin location of landraces from Tibet by cluster analysis based on SSR markers. Figure S7. Comparison of genetic similarity coefficient matrix between morphological traits and SSR markers in turnip varieties by mantel test. Figure S8. UPGMA dendrogram by cluster analysis of 171 turnip varieties using combined morphological traits and SSR markers. Codes (177-01~177-171) for varieties found in Table S1. I = group 1; II = group 2. Figure S9. Population structure of 118 Tibetan turnip landraces based on SSR analysis. Codes (177-01~177-171) for varieties found in Table S1. a: Estimation of the optimum number of groups (K). The sharp peak of K at K = 2 suggests two subpopulations. b: Graph for the parameter Ln P(D) and number of (K). c: Population structure when K = 2. d: Population structure when K = 3. The proportion of each color indicates the probability of each variety being divided into the corresponding group. Figure S10. Principal component analysis of 118 turnip landraces collected from Tibet. All varieties were aggregated into two groups, k_1 and k_2, matched to population structure. Table S1. Turnip varieties and landraces used in this study, and their geographical origin. Table S2. All morphological traits measured in this study. Table S3. Thirty-one pairs of SSR primers with polymorphism and their basic information. Table S4. Variation analysis of qualitative traits for turnip varieties. Table S5. Variation analysis of quantitative traits in turnip varieties. Table S6. Traits with significant differences between groups based on morphology-tree. Table S7. Traits with significant differences between groups based on dendrogram by SSR. Table S8. Varieties' structure and their geographical origin in P1 and P2. Table S9. Construction of primary core collection based on different sampling scales. Table S10. Eigen value and cumulative contributive percentage for the initial population and pre-core collection.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.