Article

Developing Chinese Sugar Beet Core Collection: Comprehensive Analysis Based on Morphology and Molecular Markers

1 Academy of Modern Agriculture and Ecological Environment, Heilongjiang University, Harbin 150080, China
2 Key Laboratory of Sugar Beet Genetic Breeding, Heilongjiang University, Harbin 150080, China
* Authors to whom correspondence should be addressed.
Horticulturae 2025, 11(8), 990; https://doi.org/10.3390/horticulturae11080990
Submission received: 10 July 2025 / Revised: 8 August 2025 / Accepted: 19 August 2025 / Published: 20 August 2025
(This article belongs to the Special Issue Genomics and Genetic Diversity in Vegetable Crops)

Abstract

Sugar beet (Beta vulgaris L.) is a biennial herbaceous plant belonging to the genus Beta within the family Amaranthaceae. Its taproot is an effective source for sucrose production. Sustainable development and maximizing the economic value of crops both require full utilization of crop germplasm resources and efficient production. To better facilitate the collection and utilization of sugar beet germplasm resources, this study used 106 accessions of multigerm sugar beet germplasm provided by the Key Laboratory of Molecular Genetic Breeding for Sugar Beet as materials. We evaluated the core collections constructed under various strategies using relevant genetic parameters and ultimately established two core collection construction strategies, one based on morphological data and one based on molecular markers. The optimal strategy based on morphological data was “Euclidean distance + Multiple clustering deviation sampling + UPGMA + 25% sampling proportion”, while the optimal strategy based on molecular marker data was “Jaccard distance + Multiple clustering random sampling + UPGMA + 20% sampling proportion”. In addition, the representativeness of the core collection was evaluated using parameters related to both morphology and molecular markers, and principal component analysis (PCA) was used for the final determination of the core collection. For both the morphological parameters and the molecular marker-related parameters, there were no significant differences between the constructed core collection and the original germplasm, and the phenotypic distribution frequencies were broadly similar. PCA indicated that the core collection possessed a population structure similar to that of the original germplasm. The constructed core collection therefore had good representativeness.
This study, for the first time, proposed a core collection construction approach suitable for sugar beet by integrating morphological and molecular marker methodologies. It aimed to provide a scientific basis for the utilization and development of sugar beet germplasm resources, genetic improvement, and the breeding of new cultivars.

1. Introduction

Sugar beet (Beta vulgaris L.), a biennial, herbaceous, cross-pollinated crop belonging to the Amaranthaceae family, exhibits self-incompatibility. As the world’s second-largest sugar crop, it fulfilled 21% of global sugar demand in 2021 [1]. The taproot is rich in nutrients and elements such as potassium, sodium, and nitrogen. Furthermore, it serves as a primary source of bioethanol, a product crucial for addressing energy shortages [2].
Sugar beet originated along the European coast, with evidence dating back to 8500 BC. In 1747, Andreas Marggraf discovered that its taproots were an efficient source of sucrose [3]. Subsequently, in 1905, the Polish merchant Czajkowski established the first sugar beet factory in Acheng, Heilongjiang Province. Sugar beet was introduced to China in 1906, initially cultivated in Heilongjiang. Today, cultivation has expanded to regions including Inner Mongolia, Xinjiang, and Gansu [4]. Sugar beet is not native to China, resulting in scarce wild germplasm resources and low genetic diversity. Consequently, the collection and utilization of sugar beet germplasm resources are particularly crucial. However, the vast number of these resources complicates their research and utilization by breeders. To enhance understanding of the genetic diversity composition and characteristics of collected resources, improve germplasm utilization, and broaden the genetic background of new sugar beet varieties, establishing a core collection is of great significance. Furthermore, selecting superior sugar beet germplasm resources is of immense importance in meeting the escalating demand for sugar due to population growth and rising living standards, as well as addressing productivity declines caused by rapid climate change and land scarcity in major sugar beet cultivation areas.
The steady expansion of planting scale, coupled with the constant renewal of plant materials, has made germplasm resources increasingly inconvenient to use. A core collection refers to a subset of germplasm that captures the maximal genetic diversity of the original population with a minimal number of accessions and minimal genetic redundancy [5]. This approach significantly saves time and funds for breeding programs and possesses the attributes of representativeness and practicality. As a condensed subset of the entire germplasm population, it offers advantages such as superior comprehensive agronomic traits, a broad genetic foundation, and robust combining ability. Crucially, the core collection preserves the genetic diversity of the original germplasm while minimizing the number of accessions, thereby significantly enhancing utilization efficiency [6]. Germplasm materials excluded from the core collection are not discarded; instead, they form a supplementary reserve collection [7]. Core collections provide a practical approach for understanding the genetic characteristics of germplasm resource populations. For example, to investigate cassava genetic diversity, Fregene et al. [8] selected and evaluated 76 core collection accessions using SSR markers. Similarly, when studying gene introgression in soybean using molecular markers, Amirul et al. [9] prioritized selecting experimental materials from the core collection. Since Frankel first proposed the concept of core collections in 1984 [10], numerous scholars have constructed such collections using morphological or molecular markers. This work has advanced the development of diverse construction theories and strategies. By the end of the 20th century, over 60 core collections had been established across 51 species [11]. Chen et al. [12] examined 32 phenotypic traits of tomato germplasm and analyzed genetic diversity and population structure using 48 pairs of SNP markers.
They adopted a construction strategy combining Mahalanobis distance, 10% sampling intensity, preferred sampling, and the weighted pair-group average method, ultimately establishing a core collection of 48 accessions. Guo et al. [13] conducted a genetic diversity analysis of cypress using SSR molecular markers, constructing a core collection of 31 accessions together with DNA fingerprints. Xiong et al. [14] assessed the genetic diversity of Hippeastrum hybridum using Sequence-Related Amplified Polymorphism (SRAP) molecular markers and constructed a core collection; compared to the original germplasm, the core collection demonstrated better genetic diversity in terms of H and I. Certel et al. [15] utilized both SRAP and IPBM molecular markers to reveal the genetic diversity of the common bean in Western Turkey, construct a core collection, and characterize cold tolerance traits within this core collection to develop a cold-tolerant germplasm. Wang et al. [16] successfully established a core collection of pine by integrating SRAP molecular markers with morphological traits. It is evident that SRAP molecular markers have been extensively applied in research such as crop genetic diversity analysis and core collection construction. Research on core collection establishment in China began relatively late, initially concentrating on staple food crops such as rice, wheat, and maize [17,18,19]. Furthermore, most early studies relied exclusively on either phenotypic traits or molecular markers, with few integrating both approaches.
Currently, there are limited research reports on the establishment of a sugar beet core collection both domestically and internationally, and the methodologies employed for constructing such a core collection vary [20,21]. Therefore, conducting a comprehensive evaluation of sugar beet germplasm resources and establishing a core collection will significantly advance the process of sugar beet genetic improvement. In this study, 106 accessions of sugar beet germplasm resources provided by the Key Laboratory of Molecular Genetic Breeding for Sugar Beet were used as materials. Based on 29 morphological traits and 24 pairs of highly polymorphic SRAP markers, we developed core collection construction methods suitable for different data types through integrated analysis. The representativeness of the core collection was evaluated. The purpose of this study was to provide a scientific basis for the utilization and development of sugar beet germplasm resources, genetic improvement and breeding of new varieties.

2. Materials and Methods

2.1. Plant Materials

The 106 germplasm resources tested (Table S1) were supplied by the Key Laboratory of Molecular Genetics and Breeding for Sugar Beet (Heilongjiang University, Harbin, China) and were cultivated in May 2024 at the Experimental Base of Heilongjiang University in Hulan District, Harbin (45°59′50.291″ N, 126°38′25.292″ E, Figure 1), where the annual average temperature is 3.3 °C, the annual average precipitation is 505.4 mm, and the annual average sunshine duration is 2661.4 h. A randomized block design with three replicates was used in the field. Each experimental plot consisted of two rows, each 10 m long, with a row spacing of 0.68 m and a plant spacing of 20 cm. The seeds were sown at a depth of 3 cm, and the same water and fertilizer management practices as those used in commercial field cultivation were adopted.

2.2. Morphology Data Acquisition

Twenty-nine agronomic traits were surveyed following the “Specification and Data Standard for Sugar Beet Germplasm Resources Description (Beta vulgaris L.)” [22]. These comprised 13 quantitative traits (seedling weight, leaf number, root width, root length, aspect ratio, vascular bundle, root yield, sugar yield, brix, sugar content, K+ content, Na+ content, α-N content) and 16 descriptive traits (hypocotyl color, growth vigor, chromosome, leaf color, leaf shape, leaf margin shape, leaf surface, mesophyll thickness, petiole width, petiole length, fascicled leaves type, root shape, root groove, skin, flesh color, flesh coarseness). For brix, sugar content, K+ content, Na+ content, and α-N content, the value for each test area was the mean of the measurements from 10 sugar beet plants; for the other traits, 50 sugar beet plants were randomly selected from each test area. ImageJ 1.8 [23] was employed to measure leaf and root-related traits, with statistical analysis conducted using the mean values. Weighing was performed using an electronic balance with a precision of 0.01 g. Furthermore, the descriptive traits were qualitatively assessed based on ratings and coding (Table 1).

2.3. Molecular Markers Data Acquisition

Sample collection was carried out at the seedling stage of sugar beet. Leaf samples of 106 germplasm resources were collected, and the leaves of 20 plants were randomly selected from each germplasm resource and mixed into Eppendorf tubes. The genomic DNA of sugar beet was extracted by the CTAB method [24]. A total of 24 core primer pairs (Table S2), preselected by collaborators within our research team from 546 SRAP primer combinations through preliminary screening, were used. A total of 127 alleles were amplified, with the number of alleles amplified by different primer combinations ranging from 3 to 7. The percentage of polymorphic loci (PPB) of the primers ranged from 18.49% to 55.97%, with an average value of 36.56%.
The PCR amplification protocol employed a touch-down approach: initial denaturation at 94 °C for 3 min; followed by denaturation at 95 °C for 15 s, annealing at 65 °C for 15 s (with a decrease of 1 °C per step down to 56 °C, with two cycles at each temperature), and extension at 72 °C for 30 s. Subsequently, denaturation at 94 °C for 15 s, annealing at 55 °C for 15 s, and extension at 72 °C for 30 s were repeated 20 times, and a final extension was performed at 72 °C for 5 min.
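The cycling program can be written out explicitly to check the total cycle count. The following is a minimal sketch under one reading of the protocol (annealing temperature lowered in 1 °C steps from 65 °C to 56 °C with two cycles per temperature, then 20 cycles at 55 °C); the function name and data layout are illustrative assumptions, not part of the original method.

```python
def touchdown_schedule(start_anneal=65, end_anneal=56, cycles_per_temp=2,
                       final_anneal=55, final_cycles=20):
    """Build a per-cycle list of (step, temperature in deg C, seconds) tuples."""
    cycles = []
    # Touch-down phase: annealing drops 1 deg C at a time, two cycles per
    # temperature (an assumed reading of the protocol text).
    for anneal in range(start_anneal, end_anneal - 1, -1):
        for _ in range(cycles_per_temp):
            cycles.append([("denature", 95, 15),
                           ("anneal", anneal, 15),
                           ("extend", 72, 30)])
    # Fixed-temperature phase: 20 cycles with annealing at 55 deg C.
    for _ in range(final_cycles):
        cycles.append([("denature", 94, 15),
                       ("anneal", final_anneal, 15),
                       ("extend", 72, 30)])
    return cycles


schedule = touchdown_schedule()
anneal_temps = [cycle[1][1] for cycle in schedule]
print(len(schedule))        # total number of amplification cycles
print(anneal_temps[18:22])  # transition from touch-down to fixed phase
```

Under this reading the program runs 20 touch-down cycles followed by 20 fixed cycles; the initial denaturation and final extension are single steps outside the loop.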

2.4. Data Handling

Data organization was performed using Excel 2021, and core collection construction was conducted using QGA Station [25]. For morphological data, Euclidean distance and Mahalanobis distance were used as genetic distances; for molecular marker data, Nei and Li, Jaccard, and Simple matching genetic distances were used. The sampling methods were the multiple clustering random sampling method and the multiple clustering deviation sampling method. Sampling proportions of 5%, 10%, 15%, 20%, 25%, and 30% were applied. The clustering methods employed were the single linkage method (SL), complete linkage method (CL), median method (MM), centroid method (CM), unweighted pair-group averages method (UPGMA), weighted pair-group averages method (WPGMA), flexible method (FM), and Ward’s method (WM). For morphological data, four parameters were selected as evaluation criteria: the mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR), and changeable rate of coefficient of variation (VR). For molecular marker data, four parameters were used as evaluation criteria: the percentage of variations retained (VP), index of genetic diversity (He), average expected heterozygosity (H), and average effective number of alleles (A). Combinatorial comparisons of different genetic distances, sampling methods, clustering methods, and sampling proportions were performed to identify the optimal approach for constructing the sugar beet core collection. SPSS 25 [26] software was utilized to compute the mean, variance, and range for the 29 phenotypic traits, and to generate a cluster diagram. The phenotypic traits were examined using a one-way ANOVA test. Additionally, Excel was employed to calculate the coefficient of variation (CV) and Shannon’s diversity index (H’). Molecular marker bands were scored by mobility position: at each position, the presence of a band was recorded as “1” and its absence as “0”, yielding a binary (0/1) data matrix. The software POPGENE 32 [27] was utilized to obtain genetic diversity indicators such as the average number of alleles (Na), average number of effective alleles (Ne), Nei’s genetic diversity index (H), and Shannon’s information index (I). The molecular marker clustering results were derived from MEGA 11 [28] software. Additionally, Origin 21 [29] software was employed to create frequency distribution maps for the 16 descriptive traits, as well as PCA maps illustrating phenotypic traits and molecular markers.
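To make the band scoring and the Jaccard distance concrete, here is a minimal sketch in plain Python. The band positions and accession data are hypothetical, and the helper names are illustrative; the study itself performed these steps in QGA Station and related software.

```python
def score_bands(observed_positions, all_positions):
    """Score one accession: 1 if a band is present at a position, else 0."""
    return [1 if pos in observed_positions else 0 for pos in all_positions]


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 band vectors.

    Shared absences (0, 0) are ignored; only positions where at least
    one accession shows a band contribute to the denominator.
    """
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # no bands in either accession: treat as identical
    return 1.0 - shared / either


# Hypothetical band positions (bp) for one primer combination.
positions = [100, 150, 200, 250]
acc1 = score_bands({100, 150, 250}, positions)  # [1, 1, 0, 1]
acc2 = score_bands({150, 200, 250}, positions)  # [0, 1, 1, 1]
print(acc1, acc2, jaccard_distance(acc1, acc2))
```

The two accessions share two of the four positions at which either shows a band, so the distance is 0.5. Simple matching would additionally count shared absences, which is one reason the three distances compared in the study can rank accessions differently.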

3. Results

3.1. Selection of Genetic Distances

For morphological data, core collections were constructed using two genetic distances, two sampling methods, and eight clustering methods. As shown in Table 2, the MD values of the core collections constructed under both genetic distances were less than 20%, and the CR values were greater than 80%, indicating that the core collections established using either genetic distance could faithfully represent the genetic diversity of the original population. Comparison between the Euclidean distance and Mahalanobis distance revealed that although the VD and VR values of the core collection constructed using Mahalanobis distance with the multiple clustering deviation sampling method were higher than those using Euclidean distance, the core collections constructed with Euclidean distance were overall superior in performance to those using Mahalanobis distance. Therefore, the Euclidean distance was selected for subsequent core collection construction utilizing sugar beet morphological data.
For molecular marker data, core collections were constructed using three genetic distances, two sampling methods, and eight clustering methods. Further comparison of the core collections obtained with different genetic distances across the four evaluation parameters (Table 3) revealed that the Jaccard genetic distance exhibited superior overall performance. Therefore, the Jaccard distance was designated as the optimal genetic distance for constructing the sugar beet core collection using molecular marker data.

3.2. Selection of Sampling Methods

For morphological data, core collections were constructed using the Euclidean distance combined with two sampling methods and eight clustering methods. Based on the core collection evaluation criterion of MD < 20%, it was found that there was no significant difference between the multiple clustering random sampling method and the multiple clustering deviation sampling method regarding the MD. However, comprehensive analysis of the remaining parameters revealed that the multiple clustering deviation sampling method significantly outperformed the multiple clustering random sampling method (Table 2). Therefore, the multiple clustering deviation sampling method was identified as the optimal sampling method for constructing the sugar beet core collection using morphological data.
For molecular marker data, core collections were constructed using the Jaccard distance combined with two sampling methods and eight clustering methods (Table 3). Although the multiple clustering deviation sampling method retained a higher A, the multiple clustering random sampling method retained the highest VP, He, and H. Therefore, the multiple clustering random sampling method was determined to be the optimal sampling method for constructing the sugar beet core collection using molecular marker data.

3.3. Selection of Clustering Methods

Under the fixed conditions of the selected genetic distance and sampling method for morphological data, comparisons among different clustering methods revealed that although the SL retained the highest VD, it exhibited a larger MD. In contrast, the UPGMA demonstrated superior performance in the remaining evaluation parameters (Table 2). Therefore, UPGMA was identified as the optimal clustering method for constructing the sugar beet core collection from phenotypic data.
Under the fixed conditions of the selected genetic distance and sampling method for molecular marker data, comparisons among different clustering methods revealed that the SL, CL, MM, UPGMA, and WPGMA all retained the highest percentage of VP. However, comparison of the remaining parameters showed that UPGMA demonstrated relatively superior performance (Table 3). Therefore, the UPGMA was determined to be the optimal clustering method for constructing the sugar beet core collection using molecular marker data.
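For readers unfamiliar with UPGMA, the sketch below shows the average-linkage rule in plain Python: at each step the two clusters with the smallest mean pairwise distance are merged. It is an illustrative, unoptimized implementation on hypothetical distances, not the routine used by QGA Station.

```python
from itertools import combinations


def average_distance(c1, c2, dist):
    """Mean of all pairwise distances between members of two clusters."""
    total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def upgma(labels, dist):
    """Return the merge history as (cluster, cluster, height) tuples.

    dist maps frozenset({label_a, label_b}) to the pairwise distance.
    """
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(pair[0], pair[1], dist))
        height = average_distance(c1, c2, dist)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, height))
    return merges


# Hypothetical distances among three accessions.
dist = {
    frozenset(("A", "B")): 0.25,
    frozenset(("A", "C")): 0.5,
    frozenset(("B", "C")): 0.75,
}
for step in upgma(["A", "B", "C"], dist):
    print(step)
```

Here A and B merge first at 0.25, and C then joins at the mean of its distances to A and B (0.625). Single and complete linkage would instead use the minimum and maximum of those two distances, respectively.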

3.4. Selection of Sampling Proportion

For morphological data, utilizing the Euclidean distance and the UPGMA for clustering, combined with the multiple clustering deviation sampling method, core collections were constructed at six sampling proportions: 5%, 10%, 15%, 20%, 25%, and 30%. Comparison of the six core collections (Table 4) revealed that the core collection with a 5% sampling proportion had a CR value less than 80% and was therefore insufficient to represent the genetic diversity of the original germplasm. The core collection with a 20% sampling proportion exhibited the largest MD value among the six proportions and consequently failed to effectively represent the genetic diversity of the original germplasm. Among the remaining four core collections, the CR value increased with increasing sampling proportion, reached its maximum value at the 25% proportion, and remained unchanged thereafter. Additionally, the change rate in VR value was the highest at the 25% sampling proportion. Therefore, a sampling proportion of 25% was determined to be the optimal proportion for constructing the sugar beet core collection using phenotypic data.
For molecular marker data, utilizing the Jaccard distance and the UPGMA for clustering, combined with the multiple clustering random sampling method, core collections were constructed at six sampling proportions: 5%, 10%, 15%, 20%, 25%, and 30%. As shown in Table 5, the values of the He and H increased with an increasing sampling proportion. At the 20% sampling proportion, the percentage of VP reached its maximum value and remained constant thereafter. Similarly, the A attained its maximum value at the 20% proportion but subsequently exhibited a declining trend. Therefore, a sampling proportion of 20% was identified as optimal for constructing the sugar beet core collection using molecular marker data.
Consequently, the best solution of the sugar beet core collection construction using morphological data is “Euclidean distance + Multiple clustering deviation sampling + UPGMA + 25% sampling proportion”, whereas the optimized protocol for molecular marker data is “Jaccard distance + Multiple clustering random sampling + UPGMA + 20% sampling proportion”.

3.5. Clustering Analysis Based on Morphology and Molecular Markers

Clustering analysis was conducted on the original germplasm resources based on 29 phenotypic traits, resulting in their classification into four groups (Figure 2A). Group I comprises 36 germplasm resources (33.96% of the total), primarily characterized by tongue-shaped leaves with wavy surfaces, cuneiform-shaped roots, and the highest α-N content. Group II includes 16 germplasm resources (15.09%), distinguished by the highest leaf number and the highest potassium and sodium levels; notably, this group exhibits prominent tuberous root volume and uniformly white root flesh. Group III consists of 14 germplasm resources (13.21%), demonstrating vigorous seedling growth, the highest 100-seedling weight, light green leaves, erect fascicled leaves type, and exceptional sugar content; the majority of this group are tetraploid germplasm resources, which exhibit superior traits across agronomic characteristics. Group IV encompasses 40 germplasm resources (37.74%), predominantly featuring red hypocotyls, share-shaped leaves, rough root bark, and conical-shaped roots. Cluster analysis utilizing SRAP molecular markers likewise categorized the original germplasm into four groups (Figure 2B). The respective sample sizes for Groups I to IV were 3 (2.83%), 42 (39.62%), 7 (6.60%), and 54 (50.94%), with the percentages calculated from the group sizes out of 106 accessions.
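Group shares of this kind follow directly from the group sizes. The short sketch below recomputes them for the four morphological groups reported above; the function name is an illustrative assumption, and the same helper can be applied to any list of group sizes.

```python
def group_percentages(counts):
    """Percentage of the total represented by each group, to two decimals."""
    total = sum(counts)
    return [round(100 * count / total, 2) for count in counts]


# Sizes of morphological Groups I to IV as reported in the text.
morphological_groups = [36, 16, 14, 40]
print(sum(morphological_groups))                # 106 accessions in total
print(group_percentages(morphological_groups))  # [33.96, 15.09, 13.21, 37.74]
```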

3.6. Construction of Core Collection

The original germplasm was classified into four clusters based on clustering results, with sampling proportions for each cluster’s core collection determined according to the final sampled size (Table 6). For morphological data, core collection construction employed the “Euclidean distance + Multiple clustering deviation sampling + UPGMA + 25% sampling proportion” methodology, while for molecular marker data, it utilized the “Jaccard distance + Multiple clustering random sampling + UPGMA + 20% sampling proportion” methodology. The core accessions constructed based on morphological and molecular marker data were distinct. By integrating the extracted core accessions from both datasets, redundant accessions common to both sets were consolidated into single entries, while unique accessions were fully retained. This process yielded a final core collection of 43 germplasm resources (Table S3).
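The arithmetic behind the final collection size can be sketched as follows. The sizes of the two single-source core sets (27 and 21) and their overlap of five accessions are the figures given in the Conclusions; the accession IDs and the round-half-up rule are hypothetical assumptions used only for illustration.

```python
def core_size(n_accessions, proportion):
    """Target number of core accessions, rounding half up (an assumed rule)."""
    return int(n_accessions * proportion + 0.5)


def merge_cores(morph_core, marker_core):
    """Union of two core sets: shared accessions are kept only once."""
    return sorted(set(morph_core) | set(marker_core))


# Hypothetical accession IDs; only the set sizes and the overlap of five
# accessions follow the figures reported in the article.
morph_core = list(range(1, 28))    # 27 accessions (IDs 1..27)
marker_core = list(range(23, 44))  # 21 accessions (IDs 23..43)
final_core = merge_cores(morph_core, marker_core)

print(core_size(106, 0.25), core_size(106, 0.20))  # 27 21
print(len(final_core))                             # 27 + 21 - 5 = 43
```

A set union is used so that accessions selected by both strategies appear once, which matches the consolidation described above.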

3.7. Representative Evaluation of Core Collection Based on Morphology

One-way ANOVA indicated no significant differences in 26 phenotypic traits between the core collection and the original germplasm. The majority of traits exhibited greater variance in the core collection compared to the original germplasm, demonstrating effective reduction in genetic redundancy. Variation ranges of most traits were preserved within the core collection, confirming its representative capacity for trait variation amplitudes. Furthermore, the CV for most traits was higher in the core collection, while genetic diversity indices remained comparable to the original germplasm (Table 7). Distribution frequency analyses of 16 descriptive traits revealed close alignment between the core collection and original germplasm (Figure 3). These results collectively demonstrate that the core collection closely represents the genetic diversity and population structure of the original germplasm repository.
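The frequency comparison and Shannon’s diversity index for a descriptive trait can be illustrated with a few lines of Python. This is a sketch on hypothetical trait codes using the standard formula H’ = -Σ p_i ln p_i; the study computed these values in Excel, and the function names here are assumptions.

```python
import math
from collections import Counter


def relative_frequencies(codes):
    """Share of accessions falling in each category of a descriptive trait."""
    counts = Counter(codes)
    n = len(codes)
    return {category: count / n for category, count in counts.items()}


def shannon_index(codes):
    """Shannon's diversity index H' = -sum(p_i * ln(p_i)) over categories."""
    return -sum(p * math.log(p) for p in relative_frequencies(codes).values())


# Hypothetical leaf-color codes for an original population and a core subset.
original = [1, 1, 1, 2, 2, 2, 2, 3]
core = [1, 2, 2, 3]
print(relative_frequencies(original), shannon_index(original))
print(relative_frequencies(core), shannon_index(core))
```

A category that is rare in the original population can drop out of a small core subset entirely, which lowers H’ and removes that bar from the frequency distribution.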

3.8. Representative Evaluation of Core Collection Based on SRAP Molecular Markers

The representativeness of the core collection was evaluated using four genetic diversity indices: Na, Ne, H, and I. The original germplasm exhibited values of Na = 1.7492, Ne = 1.4256, H = 0.2614, and I = 0.3772. In comparison, the core collection showed marginally higher values for Ne (1.4321) and I (0.3815), while maintaining comparable levels for Na and H (Table 8). This demonstrates that the core collection effectively preserves the genetic diversity level of the original germplasm.
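As an illustration of how per-locus values of Ne, H, and I can be obtained from dominant band data, the sketch below uses a common convention: assuming Hardy-Weinberg equilibrium, the frequency of the null (band-absent) allele is estimated as the square root of the proportion of accessions lacking the band. The article does not state the settings used in POPGENE, so this convention and the function name are assumptions.

```python
import math


def dominant_locus_indices(band_absent_freq):
    """Return (Ne, H, I) for one dominant locus.

    band_absent_freq: proportion of accessions scored 0 at this locus.
    Assumes Hardy-Weinberg equilibrium, so q = sqrt(band_absent_freq).
    """
    q = math.sqrt(band_absent_freq)  # null allele frequency
    p = 1.0 - q                      # band-producing allele frequency
    homozygosity = p * p + q * q
    ne = 1.0 / homozygosity          # effective number of alleles
    h = 1.0 - homozygosity           # Nei's gene diversity
    i = -sum(x * math.log(x) for x in (p, q) if x > 0)  # Shannon's index
    return ne, h, i


# Hypothetical locus where a quarter of the accessions lack the band.
ne, h, i = dominant_locus_indices(0.25)
print(ne, h, i)
```

With two equally frequent alleles the locus reaches its maximum values (Ne = 2, H = 0.5, I = ln 2); the population-level figures reported above are averages over all loci, including monomorphic ones.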

3.9. Finalization of the Core Collection

To further evaluate whether the finalized core collection adequately represents the original germplasm, PCA was utilized to finalize the composition of the constructed core collection. As illustrated in the principal component distribution map (Figure 4), the core collection has eliminated a significant number of redundant germplasms located near the center, and it is evenly distributed within the original germplasm. Consequently, the core collection has effectively retained the genetic diversity and population structure of the original germplasm, guaranteeing its validity and representativeness; the screening results are thus considered highly reliable.

4. Discussion

Germplasm resources collection and conservation exhibit both comprehensiveness and long-term sustainability, laying the foundation for breeding superior cultivars [30]. Core collection construction streamlines germplasm management and utilization [31], facilitating insights into the compositional characteristics of existing genetic diversity and its potential value. Simultaneously, it provides scientific guidance for future germplasm introduction and collection initiatives. Over decades, scholars have proposed diverse methodologies for core collection development. Initially, researchers constructed core collections based on phenotypic trait variations. However, phenotypic traits are susceptible to environmental influences, where climatic and edaphic variations may introduce biases [32]. Sole reliance on phenotypic data fundamentally fails to reflect genetic relationships among accessions [33]. However, there are some drawbacks to building a core collection using only molecular markers [34]. In some cases, SRAP markers may exhibit inherent linkage disequilibrium and uneven genomic distribution, potentially compromising core collection accuracy and representativeness [35]. Therefore, it was found that relying solely on a single information source for grouping was not ideal, and that a core collection constructed by integrating genetic information from morphology and molecular markers would be more representative [36].
Constructing a core collection requires determining the optimal construction strategy, which involves parameters such as genetic distance, sampling proportion, sampling method, and clustering method [37]. Different genetic parameters will yield different results [38]. When constructing a core collection using morphological data, parameters such as MD, VD, CR, and VR are employed to evaluate the representativeness of core collections built under different genetic parameters, ultimately leading to the selection of the best strategy. A core collection best reflects the genetic diversity of the original germplasm population when it satisfies MD < 20% and simultaneously CR > 80%, with larger VR and CR values indicating better representativeness. For the core collection construction strategy based on morphological data (Euclidean distance + Multiple clustering deviation sampling + UPGMA + 25% sampling proportion), MD was below 20%, CR was 95.28%, VD was 17.24%, and VR was 59.03%. This indicates that the sugar beet core collection constructed using this strategy exhibits good representativeness. When constructing a core collection using molecular marker data, VP, He, H, and A are four critically important evaluation parameters. The core collection construction strategy determined based on molecular marker data (Jaccard distance + Multiple clustering random sampling + UPGMA + 20% sampling proportion) retained the maximum VP while simultaneously maximizing the values of He, H, and A.
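The article does not spell out the formulas for CR and VR. The sketch below assumes the widely used definitions, in which CR is the mean ratio of core range to original range across traits and VR is the mean ratio of core CV to original CV, both expressed as percentages. The trait data and the MD value passed to the threshold check are hypothetical.

```python
from statistics import mean, stdev


def coincidence_rate(core, original):
    """CR (%): mean ratio of core range to original range across traits."""
    ratios = [(max(core[t]) - min(core[t])) /
              (max(original[t]) - min(original[t])) for t in original]
    return 100 * mean(ratios)


def variable_rate(core, original):
    """VR (%): mean ratio of core CV to original CV across traits."""
    def cv(values):
        return stdev(values) / mean(values)
    ratios = [cv(core[t]) / cv(original[t]) for t in original]
    return 100 * mean(ratios)


def passes_criteria(md_percent, cr_percent):
    """Representativeness thresholds quoted in the text: MD < 20%, CR > 80%."""
    return md_percent < 20 and cr_percent > 80


# Hypothetical measurements for two quantitative traits.
original = {"root_length": [10, 12, 14, 16, 18, 20], "leaf_number": [1, 2, 3, 4, 5]}
core = {"root_length": [10, 14, 20], "leaf_number": [2, 3, 4]}
print(coincidence_rate(core, original))  # 75.0: second trait loses half its range
print(variable_rate(core, original))
print(passes_criteria(5.0, 95.28))       # MD of 5.0 is a hypothetical value
```

In this toy example the core keeps the full range of root length but only half the range of leaf number, so CR falls to 75% and the subset would fail the CR > 80% criterion.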
The data sources for the construction of this core germplasm are categorized into morphological data and DNA molecular marker data. Phenotypic data, often favored due to their intuitiveness and ease of manipulation, are susceptible to environmental influences [39]. Conversely, DNA molecular marker technology boasts stability and high sensitivity; while it enables precise genetic association analysis, it struggles to directly represent phenotypic trait variation [40]. The 29 phenotypic traits employed in this study encompass both quantitative and descriptive traits, constituting the most comprehensive characterization to date. In terms of selecting molecular marker types, SRAP molecular markers are noted for their high efficiency, simplicity, high yield, and excellent reproducibility. SRAP molecular markers have been extensively utilized in various fields, including genetic diversity analysis, gene mapping, and molecular marker-assisted selection, for a wide range of crops [41]. The assessment of the representativeness of a core collection primarily encompasses morphological data and molecular marker data. The evaluation criteria for morphological data include the CV, H’, variance, and phenotypic distribution frequency. Meanwhile, the evaluation of molecular marker data encompasses the Na, Ne, H, and I. The comparison between the core collection and the original germplasm revealed that, apart from leaf color, root width, and brix, there were no significant differences in the remaining phenotypic traits compared to the original germplasm. The phenotypic distribution frequency indicated that the trend in the distribution frequency of traits in the core collection was broadly similar to that of the original germplasm. However, the distribution frequencies of some traits were originally very low in the original germplasm. Because the sampling proportion of the core collection is relatively small, this led to the absence of these traits in the core collection. 
If these traits can be more finely distinguished in the future, this phenomenon can be alleviated.

5. Conclusions

This study investigated 29 phenotypic traits across 106 sugar beet germplasm accessions and utilized 24 SRAP molecular markers. Two distinct core collection construction strategies were employed, resulting in the development of a core collection. When constructing the core collection using morphological data, these accessions were found to exhibit high coefficients of variation and good diversity. After evaluating various construction strategies, the optimal strategy “Euclidean distance + Multiple clustering deviation sampling + UPGMA + 25% sampling proportion” was ultimately identified. Using this strategy, a core collection comprising 27 germplasm resources was constructed. When constructing the core collection using molecular marker data, following the evaluation of various construction strategies, the optimal strategy “Jaccard distance + Multiple clustering random sampling + UPGMA + 20% sampling proportion” was ultimately determined, resulting in a core collection containing 21 germplasm resources. The core collections constructed from morphological and molecular marker data shared five germplasm resources. Consequently, the final core collection is composed of 43 germplasm resources (27 + 21 − 5). The representativeness of the core collection was evaluated. For the morphological data, the mean, variance, coefficient of variation (CV), and H’ of the core collection showed no significant difference from those of the original germplasm, and the frequency distribution trends of the traits were similar to those in the original germplasm. Regarding the molecular marker data, the Na, Ne, H, and I of the core collection were similar to those of the original germplasm population. PCA indicated that the population structure of the core collection was similar to that of the original germplasm. In summary, the core collection of 43 germplasm resources constructed in this study has good representativeness.
Moreover, systematic evaluation confirmed the feasibility of both core collection construction strategies. Future efforts should focus on strengthening germplasm resource collection, expanding the size of the germplasm pool, and using accessions with richer genetic backgrounds for core collection development. These findings provide a theoretical foundation and a practical reference for the development and utilization of sugar beet germplasm resources.
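The size of the final core collection follows from a simple set union: 27 + 21 − 5 = 43. The following minimal Python sketch illustrates that count. The accession labels are hypothetical placeholders; only the set sizes (27, 21, and 5 shared) come from the text, and the actual accessions are those listed in Table S3.

```python
# Hypothetical accession labels; only the set sizes (27, 21, 5 shared)
# are taken from the text. The real accessions are listed in Table S3.
morphology_core = {f"acc{i:03d}" for i in range(1, 28)}   # 27 accessions
marker_core = {f"acc{i:03d}" for i in range(23, 44)}      # 21 accessions

shared = morphology_core & marker_core       # accessions selected by both strategies
final_core = morphology_core | marker_core   # merged core collection

print(len(morphology_core), len(marker_core), len(shared), len(final_core))
# prints: 27 21 5 43
```

Merging by set union counts each shared accession once, which is why the final collection is smaller than the sum of the two strategy-specific collections.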

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/horticulturae11080990/s1, Table S1: Numbers and names of the 106 multigerm sugar beet germplasm resources; Table S2: Molecular markers used in this study; Table S3: Core collection sampling results.

Author Contributions

Conceptualization, J.L. and Z.W.; data curation, J.L.; formal analysis, Y.S.; funding acquisition, Z.W.; investigation, J.L.; methodology, Z.W.; project administration, J.L.; resources, Z.W.; supervision, S.L., Z.P. and Z.W.; validation, S.L.; visualization, J.L.; writing—original draft, J.L.; writing—review and editing, Z.P. and Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Sugar Industry Modern Agricultural Industry Technology System (CARS-17) and the Heilongjiang Provincial Seed Industry Innovation and Development Project (Sugar Beet Joint Breeding Research).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We extend our sincere thanks to our fellow team members for their collaborative efforts, as well as to all who provided guidance and support throughout this project.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Keller, I.; Neuhaus, H. Innovations and threats facing the storage of sugar in sugar beet. Curr. Opin. Plant Biol. 2025, 85, 102721.
  2. Jain, S.; Kumar, S. A comprehensive review of bioethanol production from diverse feedstocks: Current advancements and economic perspectives. Energy 2024, 296, 131130.
  3. Pathak, A.; Srivastava, S.; Misra, V.; Mall, A.; Srivastava, S. Evolution and history of sugar beet in the world: An overview. In Sugar Beet Cultivation, Management and Processing; Springer: Singapore, 2022; pp. 3–10.
  4. Geng, G.; Yang, J. Sugar Beet Production and Industry in China. Sugar Tech 2015, 17, 13–21.
  5. Gu, R.; Fan, S.; Wei, S.; Li, J.; Zheng, S.; Liu, G. Developments on Core Collections of Plant Genetic Resources: Do We Know Enough? Forests 2023, 14, 926.
  6. Odong, T.; Jansen, J.; Eeuwijk, F.; Hintum, L. Quality of core collections for effective utilisation of genetic resources review, discussion and interpretation. Theor. Appl. Genet. 2013, 126, 289–305.
  7. Lyu, X.; Liu, G.; Li, Y.; Ji, Y.; Li, Y.; Li, Y.; Cheng, Y.; Wu, Z.; Zhang, X. Current Status and Prospects of Plant Core Germplasm Research. J. Plant Genet. Resour. 2025, 26, 1693–1707.
  8. Fregene, M.; Suarez, M.; Mkumbira, J.; Kulembeka, H.; Ndedya, E.; Kulaya, A.; Mitchel, S.; Gullberg, U.; Rosling, H.; Dixon, A. Simple sequence repeat marker diversity in cassava landraces: Genetic diversity and differentiation in an asexually propagated crop. Theor. Appl. Genet. 2003, 107, 1083–1093.
  9. Amirul, I.F.; Beebe, S.; Muñoz, M.; Tohme, J.; Redden, R.; Basford, K. Using molecular markers to assess the effect of introgression on quantitative attributes of common bean in the Andean gene pool. Theor. Appl. Genet. 2004, 108, 243–252.
  10. Frankel, O.H. Genetic perspectives of germplasm conservation. In Genetic Manipulation: Impact on Man and Society; Arber, W., Limensee, K., Peacock, W.J., Stralinger, P., Eds.; Cambridge University Press: Cambridge, UK, 1984; pp. 161–170.
  11. Brown, A. Core collections: A practical approach to genetic resources management. Genome 1999, 31, 818–824.
  12. Chen, Y.; Liu, Y.; Zheng, S.; Cheng, X.; Guo, M.; Li, S.; Wang, M. Construction of a core collection of tomato (Solanum lycopersicum) germplasm based on phenotypic traits and SNP markers. Sci. Hortic. 2025, 339, 113855.
  13. Guo, L.; Liao, T.; Wang, Y.; Cao, J.; Liu, G. Construction of a DNA fingerprint map and a core collection of Platycladus orientalis. J. Am. Soc. Hortic. Sci. 2024, 149, 142–151.
  14. Xiong, M.; Wang, Y.; Chen, D.; Wang, X.; Zhou, D.; Wei, Z. Assessment of genetic diversity and identification of core germplasm in single-flowered amaryllis (Hippeastrum hybridum) using SRAP markers. Biotechnol. Biotechnol. Equip. 2020, 34, 966–974.
  15. Certel, B.; İkten, H.; Yilmaz, Y.; Kantar, F.; Çiftçi, V.; Gözen, V.; Tepe, A. Molecular Characterization of Cold Tolerant Germplasm of Phaseolus Beans with Sequence Related Amplified Polymorphism (Srap) and Retrotransposon-Based Interprimer Binding Sites (Ipbs) Markers. J. Anim. Plant Sci. 2023, 33, 620–632.
  16. Wang, X.; Cao, L.; Gao, J.; Li, K. Strategy for the construction of a core collection for Pinus yunnanensis Franch. to optimize timber based on combined phenotype and molecular marker data. Genet. Resour. Crop Evol. 2021, 68, 3219–3240.
  17. Zhai, N.; Tang, J.; Zhou, J.; Zhou, C.; Wang, J.; Yun, Y.; Han, S.; Wang, Y.; Yan, W.; Xing, N. Genetic Diversity Analysis and Core Germplasm Construction of Oryza rufipogon Griff. in Hainan. J. Plant Genet. Resour. 2024, 25, 1624–1636.
  18. Li, X.; Li, Y.; Hu, G.; Liu, Y.; Li, H.; Zhang, F.; Li, Y.; Wang, Y. Construction and Utilization of Applied Core Collection in Maize. J. Plant Genet. Resour. 2023, 24, 911–916.
  19. Yan, T.; Zhang, W.; Zheng, J.; Guo, J.; Li, X.; Qiao, Y.; Chen, F.; Chang, F.; Zhang, J. Evaluation of Adult Stage Resistance of 57 Chinese Wheat Mini-Core Collections to Wheat Stripe Rust and Leaf Rust. J. Northeast. Agric. Sci. 2023, 48, 30–34.
  20. Barański, R.; Grzebelus, D.; Frese, L. Estimation of genetic diversity in a collection of the Garden Beet Group. Euphytica 2001, 122, 19–29.
  21. Abramoff, M.; Magalhães, P.; Ram, S. Image processing with ImageJ. Biophotonics Int. 2004, 11, 36–42.
  22. Reeves, P.; Panella, L.; Richards, C. Retention of agronomically important variation in germplasm core collections: Implications for allele mining. Theor. Appl. Genet. 2012, 124, 1155–1171.
  23. Cui, P. Descriptors and Data Standard for Beet (Beta vulgaris L.); China Agriculture Press: Beijing, China, 2006; pp. 8–21.
  24. Li, J.; Wang, S.; Yu, J.; Wang, L.; Zhou, S. A Modified CTAB Protocol for Plant DNA Extraction. Chin. Bull. Bot. 2013, 48, 72–78.
  25. Chen, G.; Zhu, X.; Zhang, F.; Zhu, J. Quantitative genetic analysis station for the genetic analysis of complex traits. Chin. Sci. Bull. 2012, 57, 2721–2726.
  26. Karen, B.; Nancy, L.; Gene, G.; George, M. IBM SPSS for Introductory Statistics. Psychol. Methods Stat. 2019, 15, 266.
  27. Yeh, C.; Yang, C.; Boyle, T. POPGENE, version 1.3.1; Microsoft Windows-Based Freeware for Population Genetic Analysis; University of Alberta and the Centre for International Forestry Research: Edmonton, AB, Canada, 1999.
  28. Tamura, K.; Stecher, G.; Kumar, S.; Battistuzzi, F.U. MEGA11: Molecular Evolutionary Genetics Analysis Version 11. Mol. Biol. Evol. 2021, 38, 3022–3027.
  29. Kathy, M. Origin Update. Science 2000, 288, 1982.
  30. Mondal, R.; Kumar, A.; Gnanesh, B.N. Crop germplasm: Current challenges, physiological-molecular perspective, and advance strategies towards development of climate-resilient crops. Heliyon 2023, 9, e12973.
  31. Upadhyaya, H.; Gowda, L.; Sastry, D. Plant genetic resources management: Collection, characterization, conservation and utilization. J. SAT Agric. Res. 2008, 6, 16.
  32. Zhu, H.; Song, P.; Gu, D.; Guo, Q.; Li, Y.; Sun, S.; Weng, Y.; Yang, M. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis. BMC Genom. 2016, 17, 557.
  33. Cobb, N.; Clerck, G.; Greenberg, A.; Randy, C.; Susan, C. Next-generation phenotyping: Requirements and strategies for enhancing our understanding of genotype–phenotype relationships and its relevance to crop improvement. Theor. Appl. Genet. 2013, 126, 867–887.
  34. Kumar, A.; Kumar, S.; Singh, K.B.; Prasad, M.; Thakur, J.K. Designing a Mini-Core Collection Effectively Representing 3004 Diverse Rice Accessions. Plant Commun. 2020, 1, 100049.
  35. Han, P.; Tian, X.; Wang, Y.; Huang, C.; Ma, Y.; Zhou, X.; Yu, Y.; Zhang, D.; Xu, H.; Cao, Y.; et al. Construction of a core germplasm bank of upland cotton (Gossypium hirsutum L.) based on phenotype, genotype and favorable alleles. Genet. Resour. Crop Evol. 2022, 69, 2309–2411.
  36. Katinas, L.; Crisci, J. Agriculture Biogeography: An emerging discipline in search of a conceptual framework. Prog. Phys. Geogr. Earth Environ. 2018, 42, 513–529.
  37. Mahmoodi, R.; Dadpour, M.R.; Hassani, D.; Zeinalabedini, M.; Vendramin, E.; Micali, S.; Nahandi, F.Z. Development of a core collection in Iranian walnut (Juglans regia L.) germplasm using the phenotypic diversity. Sci. Hortic. 2019, 249, 439–448.
  38. Wang, C.; Hu, J.; Xu, M.; Zhang, S. A strategy on constructing core collections by least distance stepwise sampling. Theor. Appl. Genet. 2007, 115, 1–8.
  39. Dean, R.; Mank, J. The role of sex chromosomes in sexual dimorphism: Discordance between molecular and phenotypic data. J. Evol. Biol. 2015, 7, 1443–1453.
  40. Grover, A.; Sharma, P.C. Development and use of molecular markers: Past and present. Crit. Rev. Biotechnol. 2015, 36, 290–302.
  41. Aneja, B.; Yadav, N.; Chawla, V.; Yadav, C. Sequence-related amplified polymorphism (SRAP) molecular marker system and its applications in crop improvement. Mol. Breed. 2012, 30, 1635–1648.
Figure 1. Location of study area.
Figure 2. Clustering diagram of original germplasm based on morphology and molecular markers. (A) Based on morphology; (B) based on molecular markers.
Figure 3. Frequency distribution of 16 descriptive traits between core collection and original germplasm.
Figure 4. Principal component distribution of core collection and all accessions. (A) Based on morphology; (B) based on molecular markers.
Table 1. Sugar beet descriptive traits assignment table.
| Trait | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Hypocotyl color | Red | Green | Mix | - | - |
| Growth vigor | Very weak | Weak | Medium | Vigorous | Very vigorous |
| Chromosome | Diploid | Tetraploid | - | - | - |
| Leaf color | Light green | Green | Dark green | Yellow green | - |
| Leaf shape | Share | Tongue | - | - | - |
| Leaf margin shape | Full margin | Small wave | Medium wave | Big wave | - |
| Leaf surface | Smooth | Wavy | Slight crease | More creases | - |
| Mesophyll thickness | Thin | Medium | Thick | - | - |
| Petiole width | Narrow | Medium | Wide | - | - |
| Petiole length | Short | Medium | - | - | - |
| Fascicled leaves type | Erect | Semicrawl | Crawl | - | - |
| Root shape | Cuneiform | Conical | Spindle | Regular | - |
| Root groove | None | Not obvious | Shallow | Deep | - |
| Skin | Very smooth | Smoother | Very rough | - | - |
| Flesh color | White | Light yellow | - | - | - |
| Flesh coarseness | Fine | Medium | Crude | - | - |
Table 2. Differential analysis of the sugar beet core collection constructed based on morphological data.
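| Genetic Distance | Clustering Method | Random Sampling MD% | Random Sampling VD% | Random Sampling CR% | Random Sampling VR% | Deviation Sampling MD% | Deviation Sampling VD% | Deviation Sampling CR% | Deviation Sampling VR% |
|---|---|---|---|---|---|---|---|---|---|
| Euclidean distance | SL | 0 | 10.34 | 94.19 | 56.14 | 3.45 | 20.69 | 95.33 | 57.80 |
| Euclidean distance | CL | 0 | 6.90 | 91.10 | 57.40 | 0 | 13.79 | 93.80 | 58.03 |
| Euclidean distance | MM | 0 | 3.45 | 91.62 | 56.47 | 0 | 13.45 | 91.62 | 56.47 |
| Euclidean distance | CM | 3.45 | 6.90 | 94.22 | 55.58 | 0 | 10.34 | 95.27 | 57.28 |
| Euclidean distance | UPA | 0 | 6.70 | 88.90 | 57.13 | 0 | 17.24 | 95.28 | 59.03 |
| Euclidean distance | WPA | 0 | 3.45 | 89.64 | 56.38 | 0 | 17.24 | 93.80 | 58.16 |
| Euclidean distance | FM | 0 | 6.90 | 93.40 | 55.64 | 0 | 13.79 | 93.63 | 58.43 |
| Euclidean distance | WM | 0 | 6.90 | 91.91 | 55.39 | 0 | 17.24 | 93.66 | 58.24 |
| Mahalanobis distance | SL | 3.45 | 13.79 | 93.75 | 56.11 | 3.45 | 13.79 | 96.39 | 58.53 |
| Mahalanobis distance | CL | 0 | 6.90 | 90.90 | 56.50 | 3.45 | 13.79 | 95.32 | 57.23 |
| Mahalanobis distance | MM | 0 | 6.90 | 90.30 | 55.53 | 0 | 10.34 | 96.23 | 58.79 |
| Mahalanobis distance | CM | 3.45 | 3.45 | 95.12 | 57.90 | 3.45 | 17.24 | 96.96 | 58.40 |
| Mahalanobis distance | UPA | 0 | 3.45 | 89.53 | 56.09 | 6.90 | 17.24 | 92.72 | 55.81 |
| Mahalanobis distance | WPA | 0 | 3.45 | 92.28 | 56.79 | 0 | 20.69 | 95.05 | 57.94 |
| Mahalanobis distance | FM | 0 | 3.44 | 90.32 | 56.51 | 0 | 20.69 | 94.76 | 57.59 |
| Mahalanobis distance | WM | 0 | 3.45 | 84.04 | 52.58 | 3.45 | 20.69 | 95.33 | 57.80 |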
Table 3. Differential analysis of the sugar beet core collection constructed based on molecular marker data.
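| Genetic Distance | Clustering Method | Random Sampling VP | Random Sampling He | Random Sampling H | Random Sampling A | Deviation Sampling VP | Deviation Sampling He | Deviation Sampling H | Deviation Sampling A |
|---|---|---|---|---|---|---|---|---|---|
| Nei and Li | SL | 0.93 | 0.55 | 0.37 | 3.02 | 0.93 | 0.55 | 0.37 | 3.02 |
| Nei and Li | CL | 0.93 | 0.57 | 0.39 | 2.71 | 0.93 | 0.57 | 0.39 | 2.72 |
| Nei and Li | MM | 0.92 | 0.55 | 0.37 | 3.10 | 0.93 | 0.55 | 0.37 | 3.10 |
| Nei and Li | CM | 0.91 | 0.55 | 0.37 | 2.98 | 0.92 | 0.55 | 0.37 | 3.00 |
| Nei and Li | UPA | 0.93 | 0.57 | 0.38 | 2.83 | 0.92 | 0.58 | 0.40 | 2.72 |
| Nei and Li | WPA | 0.91 | 0.56 | 0.38 | 2.89 | 0.94 | 0.56 | 0.38 | 2.90 |
| Nei and Li | FM | 0.91 | 0.57 | 0.39 | 2.86 | 0.93 | 0.58 | 0.39 | 2.89 |
| Nei and Li | WM | 0.92 | 0.61 | 0.42 | 2.56 | 0.93 | 0.58 | 0.39 | 2.87 |
| Jaccard | SL | 0.94 | 0.55 | 0.37 | 3.02 | 0.93 | 0.55 | 0.37 | 3.02 |
| Jaccard | CL | 0.94 | 0.57 | 0.39 | 2.71 | 0.92 | 0.56 | 0.38 | 2.71 |
| Jaccard | MM | 0.94 | 0.55 | 0.36 | 3.06 | 0.92 | 0.54 | 0.36 | 3.06 |
| Jaccard | CM | 0.93 | 0.54 | 0.36 | 3.06 | 0.92 | 0.53 | 0.35 | 3.30 |
| Jaccard | UPA | 0.94 | 0.58 | 0.40 | 2.87 | 0.93 | 0.58 | 0.39 | 2.87 |
| Jaccard | WPA | 0.94 | 0.56 | 0.38 | 2.89 | 0.92 | 0.56 | 0.38 | 2.89 |
| Jaccard | FM | 0.93 | 0.58 | 0.39 | 2.87 | 0.92 | 0.58 | 0.39 | 2.87 |
| Jaccard | WM | 0.93 | 0.57 | 0.37 | 2.86 | 0.92 | 0.56 | 0.36 | 2.86 |
| Simple matching | SL | 0.93 | 0.54 | 0.43 | 2.40 | 0.94 | 0.54 | 0.42 | 2.40 |
| Simple matching | CL | 0.93 | 0.53 | 0.42 | 2.52 | 0.94 | 0.53 | 0.42 | 2.52 |
| Simple matching | MM | 0.92 | 0.54 | 0.43 | 2.40 | 0.93 | 0.54 | 0.42 | 2.37 |
| Simple matching | CM | 0.92 | 0.55 | 0.44 | 2.34 | 0.94 | 0.55 | 0.44 | 2.35 |
| Simple matching | UPA | 0.93 | 0.55 | 0.43 | 2.38 | 0.92 | 0.54 | 0.43 | 2.36 |
| Simple matching | WPA | 0.92 | 0.55 | 0.42 | 2.11 | 0.92 | 0.55 | 0.42 | 2.37 |
| Simple matching | FM | 0.92 | 0.53 | 0.42 | 2.47 | 0.92 | 0.53 | 0.42 | 2.43 |
| Simple matching | WM | 0.91 | 0.54 | 0.42 | 2.56 | 0.92 | 0.55 | 0.42 | 2.53 |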
Table 4. Comparative analysis of sugar beet core collections at varying sampling proportions based on morphological data.
| Sampling Proportion% | MD% | VD% | CR% | VR% |
|---|---|---|---|---|
| 5.00 | 0 | 20.69 | 77.99 | 56.93 |
| 10.00 | 0 | 24.13 | 88.70 | 58.12 |
| 15.00 | 0 | 13.79 | 90.30 | 58.56 |
| 20.00 | 3.45 | 17.24 | 92.98 | 58.30 |
| 25.00 | 0 | 10.35 | 94.28 | 59.03 |
| 30.00 | 0 | 10.35 | 94.28 | 57.51 |
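The MD% and VD% columns take only a few discrete values. This is consistent with each being a count of traits divided by the number of traits examined: with 29 traits, one flagged trait corresponds to 3.45% and six to 20.69%. The sketch below assumes the definitions commonly used in core collection studies. MD% and VD% are taken as the percentage of traits whose mean or variance differs significantly between core and original, and CR% and VR% as the mean core-to-original ratio of range and of coefficient of variation. These definitions and the function names are assumptions for illustration, since the study's own definitions are not restated in this section, and the range values in the example are hypothetical.

```python
def percent_of_traits(n_flagged, n_traits):
    """MD% or VD% (assumed definition): share of traits flagged as different."""
    return 100.0 * n_flagged / n_traits


def mean_ratio_percent(core_values, original_values):
    """CR% or VR% (assumed definition): mean per-trait core/original ratio, in %."""
    ratios = [c / o for c, o in zip(core_values, original_values)]
    return 100.0 * sum(ratios) / len(ratios)


# With 29 traits, one or six flagged traits reproduce tabulated values.
print(round(percent_of_traits(1, 29), 2))   # 3.45
print(round(percent_of_traits(6, 29), 2))   # 20.69

# Hypothetical ranges for two traits (not data from the study).
core_ranges = [8.0, 3.0]
original_ranges = [10.0, 4.0]
print(round(mean_ratio_percent(core_ranges, original_ranges), 2))  # 77.5
```

Under these assumed definitions, a good core collection keeps MD% and VD% low while keeping CR% and VR% high.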
Table 5. Comparative analysis of sugar beet core collections at varying sampling proportions based on molecular marker data.
| Sampling Proportion% | VP | He | H | A |
|---|---|---|---|---|
| 5.00 | 0.91 | 0.46 | 0.31 | 2.19 |
| 10.00 | 0.93 | 0.54 | 0.36 | 2.81 |
| 15.00 | 0.93 | 0.56 | 0.38 | 2.83 |
| 20.00 | 0.94 | 0.57 | 0.39 | 2.84 |
| 25.00 | 0.94 | 0.58 | 0.40 | 2.72 |
| 30.00 | 0.94 | 0.59 | 0.41 | 2.67 |
Table 6. Proportional allocation of core collection sampling based on cluster groups.
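| Category | Group | Sample Size | Extraction Proportion | Extraction Count |
|---|---|---|---|---|
| Morphology | I | 36 | 22.22% | 8 |
| Morphology | II | 16 | 25% | 4 |
| Morphology | III | 14 | 21.43% | 3 |
| Morphology | IV | 40 | 30% | 12 |
| Molecular Markers | I | 3 | 33.33% | 1 |
| Molecular Markers | II | 42 | 11.90% | 5 |
| Molecular Markers | III | 7 | 57.14% | 4 |
| Molecular Markers | IV | 54 | 20.37% | 11 |
| Total | - | 106 | 40.57% | 43 |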
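Each extraction proportion in Table 6 is the extraction count divided by the group's sample size. The short Python sketch below recomputes those percentages from the tabulated sizes and counts. The dictionary layout and the function name are illustrative choices, not part of the study.

```python
# (sample size, extraction count) per cluster group, as listed in Table 6.
groups = {
    "Morphology": {"I": (36, 8), "II": (16, 4), "III": (14, 3), "IV": (40, 12)},
    "Molecular markers": {"I": (3, 1), "II": (42, 5), "III": (7, 4), "IV": (54, 11)},
}


def extraction_percent(sample_size, extraction_count):
    """Extraction proportion as a percentage, rounded to two decimals."""
    return round(100.0 * extraction_count / sample_size, 2)


for category, clusters in groups.items():
    for group, (size, count) in clusters.items():
        print(category, group, extraction_percent(size, count))
    # Per-category totals: accessions available and accessions extracted.
    total_size = sum(size for size, _ in clusters.values())
    total_count = sum(count for _, count in clusters.values())
    print(category, "total:", total_size, "accessions,", total_count, "extracted")

print("Final core collection:", extraction_percent(106, 43), "%")
```

The per-category extraction counts sum to 27 and 21, while the Total row reports 43 because the five accessions selected by both strategies are counted once.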
Table 7. Comparison of 29 phenotypic traits between core collection and original germplasm.
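| Trait | Mean (Original) | Mean (Core) | Variance (Original) | Variance (Core) | Range (Original) | Range (Core) | CV% (Original) | CV% (Core) | H’ (Original) | H’ (Core) | t-Test |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SW | 356.2 | 381.1 | 1.4 | 1.9 | 635.8 | 553.8 | 33.0 | 36.0 | 1.9 | 1.9 | NS |
| HY | 1.9 | 1.8 | 0.8 | 0.8 | 2.0 | 2.0 | 46.6 | 50.0 | 1.1 | 1.0 | NS |
| GV | 3.7 | 3.8 | 1.1 | 0.8 | 4.0 | 3.0 | 28.4 | 23.0 | 1.4 | 1.2 | NS |
| CH | 1.3 | 1.2 | 0.2 | 0.2 | 1.0 | 1.0 | 35.4 | 33.4 | 0.6 | 0.5 | NS |
| LN | 17.6 | 17.0 | 6.3 | 8.6 | 13.0 | 13.0 | 14.3 | 17.2 | 2.0 | 1.9 | NS |
| LC | 1.7 | 1.6 | 0.6 | 0.3 | 3.0 | 2.0 | 47.3 | 35.9 | 1.0 | 0.8 | * |
| LS | 1.4 | 1.5 | 0.3 | 0.3 | 1.0 | 1.0 | 34.6 | 34.4 | 0.7 | 0.7 | NS |
| LM | 2.9 | 2.8 | 0.7 | 0.8 | 3.0 | 3.0 | 28.0 | 31.2 | 1.1 | 1.2 | NS |
| LS | 2.2 | 2.3 | 1.3 | 1.3 | 3.0 | 3.0 | 52.9 | 49.5 | 1.1 | 1.3 | NS |
| LT | 1.7 | 1.5 | 0.5 | 0.3 | 2.0 | 2.0 | 41.1 | 39.1 | 1.0 | 0.8 | NS |
| PW | 2.6 | 2.5 | 0.3 | 0.3 | 2.0 | 2.0 | 19.8 | 23.4 | 0.7 | 0.8 | NS |
| PL | 1.2 | 1.3 | 0.2 | 0.2 | 1.0 | 1.0 | 33.7 | 35.5 | 0.5 | 0.6 | NS |
| FL | 1.4 | 1.3 | 0.3 | 0.4 | 2.0 | 2.0 | 40.2 | 46.5 | 0.8 | 0.7 | NS |
| RW | 8.4 | 8.2 | 2.0 | 3.4 | 8.1 | 8.1 | 16.9 | 22.5 | 1.8 | 1.9 | * |
| RL | 23.4 | 23.8 | 7.6 | 8.8 | 13.4 | 11.4 | 11.7 | 12.5 | 2.0 | 1.9 | NS |
| AR | 2.8 | 3.0 | 0.1 | 0.2 | 1.8 | 1.8 | 12.1 | 13.1 | 2.0 | 1.8 | NS |
| VB | 6.1 | 6.2 | 0.4 | 0.5 | 2.9 | 2.9 | 10.0 | 12.0 | 2.0 | 2.1 | NS |
| RY | 36,698 | 37,251 | 8.0 | 9.4 | 50,294 | 50,294 | 24.4 | 26.1 | 2.0 | 1.9 | NS |
| SY | 5605 | 5653 | 2.5 | 2.9 | 7535 | 7159 | 26.2 | 29.9 | 2.0 | 1.8 | NS |
| BR | 20.1 | 19.9 | 5.7 | 9.7 | 12.9 | 12.9 | 11.9 | 15.6 | 2.0 | 2.1 | * |
| SC | 15.7 | 16.5 | 13.3 | 19.8 | 37.6 | 37.6 | 23.3 | 38.2 | 1.6 | 1.7 | NS |
| K+ | 4.3 | 4.3 | 0.6 | 0.6 | 5.2 | 4.3 | 17.9 | 17.7 | 1.7 | 1.5 | NS |
| Na+ | 3.4 | 3.6 | 2.3 | 3.4 | 7.6 | 7.6 | 44.6 | 51.2 | 1.9 | 1.8 | NS |
| α-N | 5.0 | 5.1 | 1.5 | 1.8 | 5.9 | 5.9 | 24.1 | 26.4 | 2.0 | 2.0 | NS |
| RS | 1.8 | 1.7 | 0.7 | 0.5 | 3.0 | 2.0 | 45.1 | 44.0 | 1.1 | 0.5 | NS |
| RG | 2.9 | 2.9 | 0.8 | 1.2 | 3.0 | 3.0 | 32.0 | 38.5 | 1.3 | 1.9 | NS |
| SK | 2.3 | 2.4 | 0.5 | 0.6 | 2.0 | 2.0 | 30.0 | 31.3 | 1.0 | 0.8 | NS |
| FC | 1.4 | 1.3 | 0.2 | 0.2 | 1.0 | 1.0 | 35.4 | 35.9 | 0.7 | 0.7 | NS |
| FL | 2.1 | 2.2 | 0.6 | 0.5 | 2.0 | 2.0 | 36.7 | 33.7 | 1.1 | 1.2 | NS |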
Note: seedling weight (SW); hypocotyl (HY); growth vigor (GV); chromosome (CH); leaf number (LN); leaf color (LC); leaf shape (LS); leaf margin (LM); leaf surface (LS); leaf thickness (LT); petiole width (PW); petiole length (PL); fascicled leaves (FL); root width (RW); root length (RL); aspect ratio (AR); vascular bundle (VB); root yield (RY); sugar yield (SY); brix (BR); sugar content (SC); K+ content (K+); Na+ content (Na+); α-N content (α-N); root shape (RS); root groove (RG); skin (SK); flesh color (FC); flesh coarseness (FL); NS: not significant; *: p < 0.05.
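For traits whose variance is reported on the same scale as the mean, the CV% column in Table 7 can be reproduced from the tabulated mean and variance. For leaf number (LN) in the original germplasm, a mean of 17.6 and a variance of 6.3 give a CV of about 14.3%. The sketch below shows that calculation together with a Shannon diversity index. It assumes that H’ is the Shannon–Weaver index computed with natural logarithms over phenotype classes, which this section does not state explicitly. The class counts passed to `shannon_index` are hypothetical.

```python
import math


def cv_percent(mean, variance):
    """Coefficient of variation in percent: standard deviation / mean * 100."""
    return 100.0 * math.sqrt(variance) / mean


def shannon_index(class_counts):
    """Shannon index H' = -sum(p_i * ln p_i) over non-empty classes (assumed form)."""
    total = sum(class_counts)
    proportions = [count / total for count in class_counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)


# Leaf number (LN), original germplasm: mean 17.6, variance 6.3 (Table 7).
print(round(cv_percent(17.6, 6.3), 1))       # 14.3

# Hypothetical class counts for a two-state trait; equal classes give ln 2.
print(round(shannon_index([53, 53]), 3))     # 0.693
```

Under this assumption a two-state trait cannot exceed ln 2 ≈ 0.69, which agrees with the H’ values of 0.5 to 0.7 listed for the two-state traits CH, PL, and FC.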
Table 8. Comparison of genetic diversity between core collection and original germplasm.
| Index | Original Germplasm | Core Collection |
|---|---|---|
| Na | 1.7492 | 1.7617 |
| Ne | 1.4256 | 1.4236 |
| H | 0.2614 | 0.2551 |
| I | 0.3772 | 0.3875 |
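The four indices in Table 8 are per-locus statistics averaged over loci. The sketch below assumes the standard definitions for a biallelic locus: observed number of alleles (Na), effective number of alleles Ne = 1/Σp², Nei's gene diversity H = 1 − Σp², and Shannon's information index I = −Σ p ln p. These definitions, the function names, and the allele frequencies are assumptions for illustration. How allele frequencies are estimated from dominant SRAP bands is not covered here.

```python
import math


def locus_diversity(p):
    """Return (Na, Ne, H, I) for one biallelic locus with allele frequency p."""
    q = 1.0 - p
    homozygosity = p * p + q * q
    na = 2 if 0.0 < p < 1.0 else 1            # observed number of alleles
    ne = 1.0 / homozygosity                   # effective number of alleles
    h = 1.0 - homozygosity                    # Nei's gene diversity
    i = -sum(x * math.log(x) for x in (p, q) if x > 0.0)  # Shannon's index
    return na, ne, h, i


def mean_diversity(frequencies):
    """Average each index over loci, as population-level summaries usually do."""
    per_locus = [locus_diversity(p) for p in frequencies]
    return tuple(sum(values) / len(per_locus) for values in zip(*per_locus))


# Hypothetical allele frequencies for four loci (not data from the study).
na, ne, h, i = mean_diversity([0.5, 0.9, 1.0, 0.7])
print(round(na, 4), round(ne, 4), round(h, 4), round(i, 4))
```

A locus with two equally frequent alleles gives the maximum values (Ne = 2, H = 0.5, I = ln 2), and a monomorphic locus gives Na = Ne = 1 with H = I = 0.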
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, J.; Song, Y.; Li, S.; Pi, Z.; Wu, Z. Developing Chinese Sugar Beet Core Collection: Comprehensive Analysis Based on Morphology and Molecular Markers. Horticulturae 2025, 11, 990. https://doi.org/10.3390/horticulturae11080990

