Article

Development and Application of an LDR-Based SNP Panel for High-Resolution Genotyping and Variety Identification in Sugarcane

1
National Engineering Research Center for Sugarcane, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2
Yunnan Key Laboratory of Sugarcane Genetic Improvement, Sugarcane Research Institute, Yunnan Academy of Agricultural Sciences, Kaiyuan 661600, China
3
Guangxi Key Laboratory for Sugarcane Biology, Guangxi University, Nanning 530005, China
*
Authors to whom correspondence should be addressed.
Agronomy 2026, 16(3), 343; https://doi.org/10.3390/agronomy16030343
Submission received: 28 November 2025 / Revised: 9 January 2026 / Accepted: 27 January 2026 / Published: 30 January 2026
(This article belongs to the Section Crop Breeding and Genetics)

Abstract

Sugarcane (Saccharum spp. L.) is a globally vital sugar and energy crop whose genetic improvement has been constrained by its complex polyploid–allopolyploid genome. To address this limitation, we developed a practical, high-throughput single-nucleotide polymorphism (SNP) genotyping system. Using specific-locus amplified fragment sequencing (SLAF-seq) on 107 diverse accessions, we identified 2,420,550 SNPs anchored to the Saccharum officinarum LA-Purple genome. Stringent filtering yielded 55,750 high-quality SNPs for population analysis, which revealed three distinct genetic groups consistent with breeding history and adaptation. From these resources, 329 SNPs were converted into PCR-based ligase detection reaction (PCR-LDR) markers, resulting in a validated panel of 177 highly reliable SNPs (151 core and 26 extended) organized into an efficient multiplex typing system. The panel exhibited exceptional discriminatory power, successfully distinguishing all 303 tested sugarcane varieties and clearly resolving 186 individuals from three segregating hybrid populations. Compared to existing SSR and SNaPshot platforms, this SNP system offers superior experimental reproducibility, enhanced varietal clustering, and broader genome coverage. This work provides a robust, efficient genotyping tool to advance sugarcane variety identification, germplasm management, pedigree analysis, and marker-assisted breeding, with potential applicability to other complex polyploid crops.

1. Introduction

Against the backdrop of a growing global population and intensifying climate change, ensuring food security has emerged as a major challenge for modern agriculture [1]. With the world population projected to reach 10 billion by 2050, crop breeding must achieve breakthroughs in enhancing both yield and stress tolerance [2]. Traditional breeding methods, characterized by lengthy cycles and limited efficiency—particularly in deciphering complex quantitative traits—struggle to meet the demands of contemporary agricultural development [3].
The advancement of molecular marker technology has provided innovative solutions for crop genetic improvement. Single-nucleotide polymorphisms (SNPs), as third-generation molecular markers, offer advantages including widespread genomic distribution, high genetic stability, and suitability for high-throughput detection [4,5]. They have been extensively applied in constructing genetic maps, quantitative trait locus (QTL) mapping, genome-wide association studies (GWAS), and genomic selection [6,7,8,9]. Nevertheless, in polyploid crops, the development and application of SNP markers remain significantly hampered by the complexity of their genomic architectures [6].
The development of efficient and practical molecular marker systems is pivotal for accelerating the genetic improvement in sugarcane (Saccharum spp.).
(i) The polyploid challenge in sugarcane: Its highly complex polyploid–allopolyploid genome presents significant obstacles to the development and application of molecular markers, particularly hindering the construction of cost-effective, high-precision genotyping systems [10,11,12].
(ii) The existing gap and the core hypothesis of this study: Although substantial sugarcane single-nucleotide polymorphism (SNP) resources have been identified via reduced-representation genome sequencing (e.g., Specific-Locus Amplified Fragment Sequencing, SLAF-seq), these resources have not been adequately translated into practical, high-throughput, low-cost, and user-friendly genotyping panels suitable for routine breeding applications [13,14]. Concurrently, established techniques such as simple sequence repeats (SSRs; limited throughput, variable reproducibility), SNaPshot (high cost, stringent instrumentation requirements), or high-density SNP arrays (prohibitive cost, complex data analysis) are ill-suited for large-scale germplasm fingerprinting and progeny screening [15,16]. Therefore, we hypothesized that converting the large-scale genomic variants identified via SLAF-seq into a minimal set of core SNP markers based on a PCR–ligase detection reaction (PCR-LDR) platform could establish a genotyping system that combines high discriminative power, excellent reproducibility, low cost, and ease of deployment in breeding laboratories, thereby surpassing the limitations of existing marker technologies in practicality, throughput, or expense [17,18].
(iii) Research objective and novelty: This study aimed to develop and systematically validate a high-performance SNP genotyping panel based on PCR-LDR [19]. Its core novelty is threefold: First, compared to existing SLAF-seq-derived resources, this work accomplishes the complete pipeline from “genome-wide SNP discovery” to “a validated, practical PCR-LDR marker system”, moving beyond population genetics analysis alone. Second, compared to previously reported sugarcane SNP, SSR, or SNaPshot systems, this panel, through optimized multiplex PCR-LDR design, achieves efficient multiplexing of 177 SNPs (151 core markers) for the first time in sugarcane. It demonstrates 100% resolution and high reproducibility across 303 cultivars and 186 hybrid progeny, while significantly reducing per-sample cost and instrumentation demands compared to sequencing or array-based solutions. Third, in terms of application relevance, the marker set was developed based on the S. officinarum genome, providing superior discriminative power and improved clustering by geographical origin for major Chinese cultivars and breeding lines. This panel offers an innovative and practical solution for sugarcane variety identification, hybrid progeny verification, and molecular breeding.

2. Materials and Methods

2.1. Experimental Materials

2.1.1. Description of Samples

The SLAF-seq materials for this study were obtained from the Sugarcane Germplasm Resource Nursery of Fujian Agriculture and Forestry University and comprised 107 samples (Table S1), of which 24 were Saccharum spontaneum and the rest were cultivars from different regions. In addition, the 303 samples used for SNP genotyping covered the main registered sugarcane varieties in China, the main parents, wild materials, and foreign varieties. The segregating population materials were obtained from the Resource Nursery of Fujian Agriculture and Forestry University, and leaves were sampled from field-grown plants at the tillering stage. For each sample, three fresh, disease-free, fully expanded leaves were collected, packed in self-sealing bags, and stored at −80 °C until DNA extraction.

2.1.2. Genomic DNA Extraction

A 0.2 g sample of fresh leaf tissue was taken, and genomic DNA was extracted using the CTAB method. DNA integrity was checked by 1.5% agarose gel electrophoresis, and purity was confirmed by OD260/280 values between 1.8 and 2.0. The DNA concentration was determined with a microplate reader, and the DNA was diluted to 50 ng/μL and stored at −20 °C for subsequent use.

2.2. SLAF Sequencing and SNP Variant Comparison

2.2.1. Reference Genome Selection

The haplotype-resolved reference genome of S. officinarum LA-Purple was used for read alignment and variant calling. This genome was selected because it represents one of the earliest released chromosome-scale assemblies for sugarcane with high continuity and annotation quality, and has been widely adopted in previous genomic studies, facilitating comparative analysis. It is noteworthy that modern sugarcane cultivars are interspecific hybrids derived from S. officinarum (contributing high sugar content) and S. spontaneum (contributing stress resilience) [12,20]. Therefore, using a single ancestral (S. officinarum) reference may introduce bias in detecting alleles originating from the S. spontaneum subgenome, which could affect the comprehensiveness of SNP discovery and subsequent population genetic inferences. This potential bias and its implications are further addressed in the Discussion section.

2.2.2. Sequencing and Alignment

Extracted genomic DNA was enzymatically digested to obtain SLAFs. After digestion of all sample DNAs, libraries were prepared by 3′-end A-tailing, ligation of dual-index sequencing adapters, PCR amplification, purification, sample pooling, and gel excision to select the target fragments; sequencing was performed once the libraries had passed quality checks. The raw reads were processed with the bioinformatic pipeline tool on the BMKCloud (www.biocloud.net) online platform. Raw reads in FASTQ format were first processed with fastp (v0.21.0) (fastp -q 10 -u 50 -y -g -Y 10 -e 20 -l 100 -b 150 -B 150). In this step, clean reads were obtained by removing reads containing adapters, reads containing poly-N, and low-quality reads; the Q30 and GC content of the clean data were calculated at the same time. All downstream analyses were based on these high-quality clean data. The clean reads were mapped to the reference genome (S. officinarum LA-Purple, https://sugarcane.gxu.edu.cn/scdb/ (accessed on 6 April 2023)) using bwa (v0.7.10-r789) (bwa mem -t 10 -M). SNP calling was performed using GATK (v3.8) (java -XX:ParallelGCThreads=5 -Djava.io.tmpdir=tmp_dir -Xmx50G -jar GenomeAnalysisTK -T UnifiedGenotyper -glm BOTH -mte --sample_ploidy 2 -I input.bam -o output.vcf) and SAMtools (v1.9.1) (samtools mpileup -t DP,SP,DP4 -vf), and the intersection of the two call sets was used as the final high-confidence SNP set. SNP annotation was performed on the basis of the reference genome using SnpEff (v3.6c) with default settings, and SNPs were categorized into intergenic regions, upstream or downstream regions, and exons or introns. SNPs in coding exons were further classified as synonymous or nonsynonymous.
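The intersection step described above can be illustrated with a minimal sketch. It assumes each caller's output is available as plain VCF text and that two records are the same SNP when chromosome, position, REF, and ALT all match; the function names and the toy records are hypothetical and are not taken from the study's pipeline.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect simple biallelic SNP keys from VCF text lines, skipping headers."""
    variants: Set[Variant] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # Keep only single-base REF/ALT pairs, which drops indels
        # and multi-allelic records such as "A,T".
        if len(ref) == 1 and len(alt) == 1 and alt != ".":
            variants.add((chrom, int(pos), ref, alt))
    return variants


def high_confidence_snps(calls_a: Iterable[str], calls_b: Iterable[str]) -> Set[Variant]:
    """Return SNPs reported by both callers (set intersection)."""
    return parse_vcf_lines(calls_a) & parse_vcf_lines(calls_b)


# Toy records standing in for the two callers' outputs (hypothetical data).
gatk_calls = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "Chr1\t100\t.\tA\tG",
    "Chr1\t250\t.\tC\tT",
    "Chr2\t75\t.\tG\tGA",
]
samtools_calls = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "Chr1\t100\t.\tA\tG",
    "Chr1\t300\t.\tT\tC",
    "Chr2\t75\t.\tG\tGA",
]
shared = high_confidence_snps(gatk_calls, samtools_calls)
print(sorted(shared))
```

In this toy input only the Chr1:100 A>G record is a SNP present in both call sets; the Chr2:75 record is shared but is an insertion and is therefore dropped.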

2.3. SNP Molecular Marker Development

2.3.1. High-Quality SNP Filtering

To obtain a high-quality SNP dataset for downstream analyses, we implemented a multi-step, stringent filtering and quality control pipeline. Following the initial SNP calling using GATK (v3.8) and SAMtools (v1.9.1), the resulting variant set was processed with bcftools (v1.9) and vcftools (v0.1.16) for filtering and format conversion. The raw VCF file was compressed with bgzip and indexed using bcftools index and tabix to expedite processing. The initial dataset comprised 2,420,550 SNPs.
Filtering proceeded in three sequential steps, with parameters selected based on standards widely adopted in population genetic studies:
Sample-Level and Frequency-Based Filtering: Using bcftools (v1.9), SNPs with a missing rate (F_MISSING) > 0.3 or a minor allele frequency (MAF) < 0.05 were removed to retain variants with sufficient representation and informativeness across the population. This step reduced the SNP count to 197,994.
Variant Quality Filtering: A second round of filtering with bcftools (v1.9) removed SNPs that met any of the following conditions: QUAL < 30.0, QD < 2.0, FS > 60.0, MQ < 40.0, or SOR > 4.0. This step eliminated SNPs with low sequencing quality or poor model fit, resulting in 91,054 SNPs.
Hardy–Weinberg Equilibrium (HWE) Filtering: The dataset was converted to PLINK format using vcftools (v0.1.16). Subsequently, PLINK (v1.9) was used to exclude SNPs that showed significant deviation from Hardy–Weinberg equilibrium (HWE p-value < 0.01), thereby removing loci with potential genotyping errors or strong selective pressures. The final high-quality dataset consisted of 55,750 SNPs, which were used for subsequent population genetic analyses and marker development.
To ensure compatibility with different analytical software, chromosome identifiers were standardized using bcftools annotate during format conversions. All bioinformatic analyses were conducted in a Linux environment, with essential software paths configured in the environment variables to ensure pipeline execution.
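The three filtering steps can be summarised as a per-site decision rule. The sketch below is illustrative only: the `SiteStats` container and function names are hypothetical, the quality thresholds are read as removal conditions, and the Hardy–Weinberg check uses a one-degree-of-freedom chi-square approximation rather than the test implemented in PLINK, so p-values near the 0.01 cut-off may differ from the study's results.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Per-SNP summary values (hypothetical container for illustration)."""
    f_missing: float  # fraction of samples with a missing genotype
    maf: float        # minor allele frequency
    qual: float
    qd: float
    fs: float
    mq: float
    sor: float
    n_aa: int         # homozygous reference count
    n_ab: int         # heterozygote count
    n_bb: int         # homozygous alternate count


def passes_frequency_filter(s: SiteStats) -> bool:
    # Step 1: remove sites with missing rate > 0.3 or MAF < 0.05.
    return s.f_missing <= 0.3 and s.maf >= 0.05


def passes_quality_filter(s: SiteStats) -> bool:
    # Step 2: remove sites failing any of the quality thresholds.
    failed = (s.qual < 30.0 or s.qd < 2.0 or s.fs > 60.0
              or s.mq < 40.0 or s.sor > 4.0)
    return not failed


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 d.f.) HWE p-value; an approximation, not PLINK's test."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: the test is uninformative
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def keep_site(s: SiteStats, hwe_alpha: float = 0.01) -> bool:
    # Step 3: remove sites with HWE p-value < 0.01.
    return (passes_frequency_filter(s) and passes_quality_filter(s)
            and hwe_chi2_pvalue(s.n_aa, s.n_ab, s.n_bb) >= hwe_alpha)


good = SiteStats(0.1, 0.2, 50.0, 10.0, 5.0, 60.0, 1.0, 25, 50, 25)
rare = SiteStats(0.1, 0.01, 50.0, 10.0, 5.0, 60.0, 1.0, 98, 2, 0)
no_hets = SiteStats(0.1, 0.5, 50.0, 10.0, 5.0, 60.0, 1.0, 50, 0, 50)
print(keep_site(good), keep_site(rare), keep_site(no_hets))
```

The second and third toy sites fail for different reasons: one on minor allele frequency, the other because the complete absence of heterozygotes deviates strongly from Hardy–Weinberg proportions.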

2.3.2. Multiple Marker Typing System

The high-quality SNP markers were further screened based on their physical locations. Specifically, loci were uniformly selected where no other variations were present within 100 bp upstream and downstream along each chromosome, ensuring marker specificity. The 100-bp flanking sequences of these candidate loci were then subjected to homology comparison, and loci with high sequence homology were excluded. Following this stringent selection process, 329 SNPs were retained and subjected to PCR-LDR conversion, ultimately resulting in the successful conversion of 191 markers. These markers were organized into four multiplex typing systems to enhance genotyping throughput. Detailed information on the SNP loci and corresponding primer concentrations for each system is provided in Supplementary Table S2.
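The physical-location screen (no other variation within 100 bp on either side) can be sketched as follows. This is a minimal illustration, assuming variant positions are supplied per chromosome as integers and that "within 100 bp" is inclusive of the window boundary; the function name and example coordinates are hypothetical.

```python
from bisect import bisect_left, bisect_right
from typing import Iterable, List


def isolated_snps(positions: Iterable[int], window: int = 100) -> List[int]:
    """Return positions with no other variant within `window` bp on either side.

    `positions` should contain every known variant position on one chromosome,
    not only the candidate SNPs, so that neighbouring variants are detected.
    """
    ordered = sorted(set(positions))
    kept = []
    for pos in ordered:
        lo = bisect_left(ordered, pos - window)
        hi = bisect_right(ordered, pos + window)
        # The slice ordered[lo:hi] always contains `pos` itself;
        # a length of 1 means no neighbour falls inside the window.
        if hi - lo == 1:
            kept.append(pos)
    return kept


# Hypothetical variant positions on a single chromosome.
chr1_variants = [1000, 1050, 1500, 1601, 2000, 2100]
print(isolated_snps(chr1_variants))
```

Here 1500 and 1601 are 101 bp apart and are both retained, whereas 2000 and 2100 are exactly 100 bp apart and are both discarded under the inclusive-window assumption.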

2.3.3. Multiple SNP Typing Kit

Amplification primers, splice primers (see Tables S3 and S4 for primer information), and LDR probes were designed based on the sequence information of the target SNPs. The amplification primers were used to amplify the target region containing the SNP locus, while the LDR probes specifically recognise the two alleles of each SNP (the reference allele and the variant allele). The design ensures that the end of each probe matches its allele exactly. The fluorescent signal intensity of the PCR-LDR products was analysed in conjunction with the genotype data to determine the SNP genotypes in different samples.

2.3.4. SNP Filtering and “Diploidization” Processing for the Polyploid Genome

To circumvent interference from homologous sequences and ensure absolute genotyping accuracy within the polyploid sugarcane genome, we performed stringent single-copy filtering on candidate SNP loci, prioritized by physical location, prior to designing PCR-LDR primers and probes. Specifically, the 100-bp flanking sequences upstream and downstream of each candidate SNP were aligned against the S. officinarum LA-Purple reference genome. Only loci exhibiting a unique best-match position (i.e., single-copy sequences) or displaying low sequence homology across the entire genome were retained. Furthermore, PCR primer design deliberately avoided regions of high homology. The loci selected through this filtering process consistently exhibited a clear biallelic pattern across all tested materials, thereby excluding the possibility of complex multi-allelic signals arising from variable dosages at multiple homologous loci. Consequently, the SNPs ultimately used to construct the marker panel can be genetically regarded as “diploidized” sites within the complex genomic context of sugarcane. This approach fundamentally addresses the common challenges of ambiguous allelic dosage and homologous sequence interference in polyploid species, laying a robust foundation for subsequent high-precision, highly reproducible genotyping based on LDR.

2.4. Population Genetic Analysis

Based on the identified high-quality SNP marker data, genetic distances between samples were calculated using bcftools and PLINK, and a neighbour-joining (NJ) phylogenetic tree was constructed and visualised with the R packages ape and ggplot2. The number of populations (K) was selected using the R (v4.2.2) package LEA; the optimal K value was determined by cross-validation, and the results were visualised using TBtools (v2.363). Population genetic structure was further examined by PCA and by comparing observed with expected heterozygosity, to assess genetic consistency within populations and genetic differentiation between populations.

2.5. SNP Data Analysis

A total of 303 sugarcane samples (Table S5) and 186 genomic DNA samples from sugarcane segregating populations were genotyped using the LDR Multiple SNP typing kit, and the genotyping data were used to construct a phylogenetic tree by the neighbour-joining (NJ) method. The phylogenetic tree analysis further validated the effectiveness of the SNP markers in variety identification and population genetic analysis.
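A neighbour-joining tree is built from a pairwise distance matrix, and two varieties can only be separated if their distance is greater than zero. The sketch below shows one simple way to derive such a matrix from biallelic genotype calls. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for a missing call and uses an allele-sharing distance; the study does not state which distance metric was used, and the variety names and genotypes are hypothetical. Tree construction itself is not shown.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Genotype = Optional[int]  # 0, 1, 2 alternate-allele count, or None if missing


def allele_sharing_distance(a: List[Genotype], b: List[Genotype]) -> float:
    """Mean per-locus allele difference, scaled to [0, 1], over shared calls."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no locus called in both samples
    return sum(abs(x - y) for x, y in shared) / (2 * len(shared))


def distance_matrix(genotypes: Dict[str, List[Genotype]]) -> Dict[Tuple[str, str], float]:
    """Pairwise distances keyed by alphabetically ordered sample-name pairs."""
    names = sorted(genotypes)
    return {
        (m, n): allele_sharing_distance(genotypes[m], genotypes[n])
        for m, n in combinations(names, 2)
    }


# Hypothetical four-marker panel for three varieties.
panel = {
    "variety_A": [0, 1, 2, None],
    "variety_B": [0, 1, 2, 2],
    "variety_C": [2, 1, 0, 2],
}
distances = distance_matrix(panel)
undistinguished = [pair for pair, d in distances.items() if d == 0]
print(distances)
print(undistinguished)
```

In this toy panel, variety_A and variety_B are identical at every locus called in both, so they would not be separated; a missing call cannot contribute discriminating information, which is one reason call rate matters for a fingerprinting panel.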

3. Results

3.1. SLAF-seq and Reference Genome Comparison

Following quality control, 107 sugarcane genomic DNA libraries were constructed for SLAF-seq. A total of 451.36 million reads were generated, with an average Q30 score of 90.94% and an average GC content of 44.12% (Table S6). Using the S. officinarum LA-Purple genome as a reference, we identified 2,420,550 raw SNP variants (Table S7).
The distribution of SLAF tags (Figure 1A) and SNPs (Figure 1B) across chromosomes was heterogeneous, indicating uneven genomic coverage which may reflect regions of differential accessibility, structural variation, or varying evolutionary constraints.

3.2. High-Quality SNP Screening

The 2,420,550 SNPs obtained after sequencing were screened according to the filtering conditions to obtain 55,750 SNPs for subsequent analyses. We analysed these 55,750 SNPs for minor allele frequency (MAF) distribution (Figure 2A) and observed heterozygosity versus expected heterozygosity (Figure 2B).
In the filtered SNP set, low-frequency alleles (MAF < 0.1) predominated (Figure 2A). This distribution pattern may be attributed to characteristics of the sugarcane materials studied: as a long-term domesticated, hybridized, and selected polyploid crop, the breeding population may have undergone bottleneck effects, and many beneficial or neutral mutations may not yet have reached high frequency. Additionally, the inherent regional bias of SLAF-seq technology in genome coverage may also influence the detection of allele frequencies. Nevertheless, these widely distributed SNPs, including low-frequency loci, provide rich polymorphic information for high-resolution genetic analysis.
Comparison of observed and expected heterozygosity showed that most data points lay close to the diagonal, indicating a high degree of consistency between the two measures [21]. These results indicate that the screened high-quality SNPs meet the requirements for subsequent analyses.
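For a single biallelic locus treated under a diploid model, the two quantities compared in Figure 2B can be computed as in the sketch below. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with `None` for missing calls; the function name and example data are hypothetical and do not reproduce the software used in the study.

```python
from typing import List, Optional, Tuple


def heterozygosity(genotypes: List[Optional[int]]) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity for one biallelic locus."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        raise ValueError("no called genotypes at this locus")
    alt_freq = sum(called) / (2 * n)                  # alternate allele frequency
    observed = sum(1 for g in called if g == 1) / n   # fraction of heterozygotes
    expected = 2 * alt_freq * (1 - alt_freq)          # 2pq under Hardy-Weinberg
    return observed, expected


# Hypothetical locus: four called genotypes and one missing call.
ho, he = heterozygosity([0, 1, 1, 2, None])
print(ho, he)
```

A locus whose observed and expected values coincide, as in this example, falls on the diagonal of an observed-versus-expected plot; an excess or deficit of heterozygotes moves the point above or below it.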

3.3. Population Genetic Diversity

We performed neighbor-joining (NJ) phylogenetic, PCA, and population structure analyses on the 107 samples using the 55,750 high-quality SNPs. The NJ tree (Figure 3A), which groups samples by genetic distance, resolved the 107 samples into three main branches corresponding to Group I (red), Group II (green), and Group III (blue). Clustering along the branches indicated high genetic similarity within groups and marked genetic differences between groups. In the PCA (Figure 3B), samples plotted on the first two principal components (PC1 and PC2) clustered according to the same three groups, further confirming the separation between groups. The cross-validation error (Figure 3C) decreased significantly from K = 2 to K = 4 and then levelled off, and K = 3 was taken as the best grouping scheme. The delineation of the three groups was highly consistent across the three analyses, which therefore validate one another: the groups are clearly genetically separated, and the tight clustering within each group reflects high within-group genetic similarity. The admixture proportions in the population structure plot also revealed possible gene flow or genetic admixture in some of the samples.
Population structure analysis revealed that the 107 sugarcane accessions could be clearly classified into three distinct genetic groups. Group I (red) predominantly consists of modern high-sucrose cultivars widely grown in China, primarily originating from Fujian, Guangdong, and Guangxi provinces, suggesting a genetic background largely derived from S. officinarum. Group II (green) is almost exclusively composed of wild S. spontaneum materials, demonstrating significant genetic differentiation from cultivated varieties. Group III (blue) includes early-bred cultivars, introduced accessions, and some landraces, likely representing intermediate or admixed genetic backgrounds. This structure aligns with the breeding history of sugarcane, reflecting genomic-level differentiation among materials with varying ancestries.

3.4. SNP Marker Development

Based on the stringent selection criteria of physical distance and homology comparison, we refined the initial set of 55,750 high-quality SNPs to 329 candidate markers for PCR-LDR conversion. Among these, 191 markers were successfully developed into functional PCR-LDR assays. After genotyping the 303 accessions using the LDR method, SNP loci missing in >30% of the accessions were excluded, a threshold that aligns with conventional standards for SNP filtering in polyploid crops [22]. This yielded a final panel of 177 robust SNP markers (detailed in Table S8). To address the challenges posed by the polyploid sugarcane genome, we ensured the specificity and biallelic nature of the final 177 SNP marker loci through stringent single-copy filtering and ‘diploidization’ processing of candidate SNPs (detailed in Section 2.3.4). According to polymorphism differences, this panel was further classified into 151 core and 26 extended SNP markers. For high-throughput application, the 177 markers were organized into four multiplex PCR systems, with corresponding SNP loci and primer concentrations specified in Table S2.
The genotyping data from the 303 accessions were used to construct a neighbor-joining (NJ) phylogenetic tree (Figure 4B). A subset of 151 markers was sufficient to completely distinguish all 303 varieties, highlighting the panel’s high discriminative power. To further validate the panel’s accuracy and applicability in breeding contexts, we genotyped three segregating hybrid progeny populations: Combination 1 (Gui Tang 00-122 × Yun Zhe 16-1005; 60 individuals), Combination 2 (Yue Tang 93-159 × Yacheng 05-64; 64 individuals), and Combination 3 (Yue Tang 00-236 × Liucheng 15-37; 62 individuals), totaling 186 individuals. The NJ phylogenetic tree constructed using the 151 core SNP markers (Figure 4C) successfully resolved all individuals, with progeny from the same parental combination forming distinct clusters. This result underscores the high quality, reproducibility, and practical utility of the 151 core SNP markers for genetic identification and analysis in sugarcane breeding programs.

3.5. Variety Authenticity Testing

We compared the LDR-SNP markers developed in this study with the SSR markers and the SNaPshot markers previously developed in our laboratory, the latter based on the genome of S. spontaneum AP85-441.
Firstly, distinct marker technologies exhibit significant differences in operational characteristics. The LDR-SNP marker system developed in this study demonstrates high specificity and favourable experimental reproducibility. In contrast, while the 21 SSR primers employed could genotype 131 loci, their experimental workflow proved complex and electrophoresis results showed poor stability, rendering them unsuitable for high-throughput applications. The SNaPshot technique, meanwhile, imposes stringent requirements on sample quality and experimental conditions, coupled with relatively high equipment costs.
Secondly, SNP markers derived from different reference genomes exhibit varying identification resolutions. Using 62 shared materials (see Table S9 for information on the materials), we constructed neighbour-joining (NJ) trees based on the three marker types. While all markers distinguished different cultivars, their clustering efficacy differed. LDR-SNP markers based on the S. officinarum LA-Purple genome clearly reflected geographical origin information when differentiating materials.
Applying the same filtering criteria described in Section 2.3.1 to variants called against the S. spontaneum AP85-441 reference genome, we obtained 53,669 high-quality SNPs. Based on these SNPs, we performed cluster analysis (Figure 5D) and PCA (Figure 5E) on the same set of 107 accessions. The results indicated that these SNPs partitioned the samples into only two genetic groups, whereas the high-quality SNPs based on the S. officinarum LA-Purple reference genome resolved the same samples into three distinct clusters. This comparison demonstrates that the S. spontaneum-based markers provide lower classification resolution than those derived from the S. officinarum reference genome, particularly for distinguishing fine-scale population structure in cultivated sugarcane.

3.6. Performance Evaluation of the SNP Marker Panel

To objectively assess the performance of the SNP marker panel developed in this study, we calculated a series of key quantitative metrics.
Polymorphism Information Content (PIC): The average PIC for the 151 core markers was 0.29, indicating moderate to good levels of polymorphism. Specifically, among the core markers, 74 (49.0%) exhibited high informativeness (PIC ≥ 0.3), and 47 (31.1%) displayed very high informativeness (PIC ≥ 0.4). This distribution demonstrates that the core marker set possesses sufficient overall genetic information to support effective variety discrimination (specific values are provided in Table S10).
Genotyping Call Rate: The marker panel demonstrated good experimental robustness. Across the 303 varieties, the average call rate for all 177 markers was 93%. Analysis of call rates for individual samples revealed that 225 varieties (74.3% of the total) achieved call rates exceeding 95%. This result indicates the high reliability of the genotyping system for the majority of materials.
Discriminatory Power: The genetic distance matrix constructed using the 151 core markers confirmed the panel’s exceptional discriminatory capacity. Pairwise genetic distances between all 303 varieties were greater than zero. The neighbor-joining phylogenetic tree (Figure 4B) built from this distance matrix placed each variety on a distinct terminal branch, with no polytomies or overlaps observed in the topology, providing visual confirmation that the panel can completely differentiate all tested varieties.
Comprehensive Performance Assessment: Although the average PIC value (0.29) is slightly lower than those reported for some high-density SNP panels in diploid crops, when considered in conjunction with its high average call rate (93%) and complete discriminatory power (100% variety distinction), the SNP panel developed in this study achieves an optimal balance of practicality, robustness, and discriminatory power within the context of sugarcane’s complex polyploid genome. Its performance is fully adequate to meet the practical demands of sugarcane variety identification, germplasm resource management, and authenticity verification of breeding materials.
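The per-marker metrics reported above can be sketched as follows. The text does not state which PIC formula was used, so this example is only an assumption: it shows the commonly used Botstein-style definition alongside gene diversity (expected heterozygosity) for a biallelic marker, together with a simple call-rate calculation. Under the Botstein-style definition a biallelic marker cannot exceed 0.375, whereas gene diversity can reach 0.5, so the reported PIC ≥ 0.4 class may rest on a definition other than the first one shown here. All function names are hypothetical.

```python
from typing import List, Optional


def pic_biallelic(p: float) -> float:
    """Botstein-style PIC for a biallelic marker with allele frequency p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


def gene_diversity(p: float) -> float:
    """Expected heterozygosity (1 - sum of squared allele frequencies)."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q)


def call_rate(genotypes: List[Optional[int]]) -> float:
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    if not genotypes:
        raise ValueError("empty genotype list")
    return sum(g is not None for g in genotypes) / len(genotypes)


print(pic_biallelic(0.5), gene_diversity(0.5))   # maxima for a biallelic marker
print(call_rate([0, 1, None, 2]))
# Proportions quoted in the text, recomputed from the reported counts.
print(round(100 * 74 / 151, 1), round(100 * 47 / 151, 1), round(100 * 225 / 303, 1))
```

The last line recomputes the percentages quoted in the text from the stated counts (74 and 47 of 151 core markers; 225 of 303 varieties).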

4. Discussion

4.1. Application and Challenges of SLAF-seq Technology in the Complex Sugarcane Genome

As an efficient and specific method for molecular marker development, SLAF-seq demonstrates clear advantages in studying complex polyploid genomes [23,24]. In this research, we identified 2,420,550 SNPs by applying this technique to sequence 107 sugarcane germplasm accessions. However, repetitive restriction enzyme cutting sites and homologous sequence interference in polyploid genomes may affect marker specificity. To address this, we optimized enzyme digestion design, integrated reference genome information, and developed tailored bioinformatic workflows, thereby enhancing the accuracy and utility of the SNP markers. The uneven distribution of SLAF-seq tags and SNPs across the genome may reflect structural variations or selective hotspots, which could serve as key regions for future fine-mapping of QTLs and screening of candidate genes [25].

4.2. A “Diploidized” SNP Screening Strategy to Address Polyploid Challenges

The widespread presence of homologous sequences and multi-allelic sites in the polyploid sugarcane genome poses significant challenges for molecular marker development [26]. Unlike the common “pseudo-diploid” simplification strategy, this study implemented stringent pre-screening. During marker development, we systematically identified SNP loci located in single-copy (single-dose) genomic regions through whole-genome BLAST alignment and confirmed their clear biallelic patterns in sequencing data. This critical step means that the final SNP marker set developed here is not a simplification of complex signals, but originates from loci within the sugarcane genome that inherently exhibit diploid-like genetic behavior. This strategy not only avoids interference from homologous sequences and ambiguous allele dosage interpretation, ensuring the specificity and reproducibility of PCR-LDR detection, but also enables subsequent population genetic analyses to be conducted under standard diploid models, thereby improving the reliability and interpretability of the results. This approach provides a feasible pathway for practical marker development in other polyploid crops.

4.3. Implications of Reference Genome Selection and Empirical Validation

Beyond the marker screening strategies for polyploid genomes discussed above, the selection of an appropriate reference genome is itself a critical factor influencing the efficacy of SNP marker development. In this study, the S. officinarum LA-Purple genome was selected as the reference, primarily due to its status as one of the earliest high-quality, chromosome-scale assemblies [11]. However, we recognize that modern sugarcane cultivars are complex interspecific hybrids derived from S. officinarum (contributing high sugar content) and S. spontaneum (contributing stress tolerance). The use of a single ancestral (S. officinarum) reference genome may theoretically lead to reduced sensitivity in detecting alleles originating from the S. spontaneum subgenome, thereby potentially affecting the comprehensiveness of SNP discovery and subsequent population genetic inferences.
To evaluate the practical impact of this potential bias and validate the effectiveness of the marker system developed in this study, we conducted a direct comparative analysis (see Section 3.5). Using an identical filtering pipeline, we derived a set of SNPs from the S. spontaneum AP85-441 genome. Comparative results revealed that the SNP panel based on S. officinarum (Figure 5A) provided clearer clustering structures consistent with geographical origins when distinguishing major Chinese cultivars, whereas the SNP set based on S. spontaneum (Figure 5D,E) partitioned the same materials into only two genetic groups, exhibiting lower resolution.
This empirical evidence demonstrates that, despite the theoretical limitations, the SNP panel developed using the S. officinarum reference genome exhibits superior practicality and discriminatory power for the identification and genetic analysis of modern cultivars dominated by S. officinarum genetic backgrounds. This is likely because genomic regions associated with agronomic traits—particularly sugar metabolism—in modern high-sugar cultivars are predominantly inherited from S. officinarum. Thus, our selection is justified and effective at the practical application level.

4.4. Population Genetic Structure and Its Breeding Implications

Based on 55,750 high-quality SNPs, population analysis divided the 107 accessions into three genetic groups. This result was highly consistent across phylogenetic, PCA, and population structure analyses, with cross-validation supporting K = 3 as the optimal grouping. Group I primarily consists of modern high-yielding cultivars with a strong S. officinarum background, widely cultivated in major producing regions such as Guangxi, Guangdong, and Fujian. In contrast, Group II is almost entirely composed of wild S. spontaneum materials, serving as a key genetic reservoir for stress adaptation. Group III likely includes early-generation hybrids, landraces, and introduced materials, exhibiting varying degrees of wild ancestry admixture and often originating from ecologically diverse regions such as Yunnan and Sichuan. This structure aligns with the hybrid breeding history of sugarcane, which is based on S. officinarum and S. spontaneum, and reflects differences in geographical origins and breeding objectives [12,27]. This genetic framework provides a theoretical basis for parental selection, as choosing parents from different groups may help synergistically improve yield, sugar content, and stress tolerance [4,28,29].
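Selecting the number of groups by cross-validation usually means taking the K with the lowest cross-validation error. The sketch below shows that rule; the error values are invented for illustration and are not the values plotted in Figure 3C.

```python
def best_k(cv_errors: dict) -> int:
    """Return the K whose cross-validation error is lowest."""
    return min(cv_errors, key=cv_errors.get)


# Hypothetical cross-validation errors for K = 1..5 (not the study's data).
cv_errors = {1: 0.62, 2: 0.55, 3: 0.51, 4: 0.53, 5: 0.56}
print(best_k(cv_errors))
```

With these illustrative numbers the minimum falls at K = 3, mirroring the grouping reported above.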

4.5. Comparison of Molecular Marker Systems and Future Applications

Compared to the previously developed SSR markers and SNaPshot markers based on the S. spontaneum genome, the PCR-LDR system constructed in this study demonstrates advantages in throughput, reproducibility, cost, and clustering efficacy. SSR markers involve cumbersome procedures and low reproducibility, while SNaPshot requires high equipment and sample quality standards. In contrast, the PCR-LDR system, based on conventional PCR platforms, offers “high-throughput, moderate-cost” features suitable for large-scale genotyping of breeding materials [5,15,30,31,32]. This marker panel can not only be used for variety identification and infringement litigation but also supports hybrid progeny authentication, background selection in backcross breeding, and germplasm resource management. Although the current markers are neutral loci, they can be further developed into functional markers through trait association in the future, thereby advancing sugarcane molecular breeding toward greater precision and efficiency.
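Variety identification with such a panel amounts to comparing genotype fingerprints pairwise and checking that no two varieties are identical at all loci. The following sketch illustrates this under stated assumptions: the four-locus fingerprints, the variety names, and the rule that a missing call (empty string) is never counted as a difference are all hypothetical choices, not taken from the study.

```python
from itertools import combinations


def n_diff(a, b) -> int:
    """Count loci where both genotype calls are present and differ."""
    return sum(1 for x, y in zip(a, b) if x and y and x != y)


def indistinguishable_pairs(fingerprints: dict, min_diff: int = 1):
    """List variety pairs separated by fewer than `min_diff` differing loci."""
    return [
        (v1, v2)
        for (v1, g1), (v2, g2) in combinations(fingerprints.items(), 2)
        if n_diff(g1, g2) < min_diff
    ]


# Hypothetical fingerprints at four biallelic SNP loci.
fingerprints = {
    "variety_1": ["AA", "AG", "CC", "TT"],
    "variety_2": ["AA", "GG", "CC", "TT"],
    "variety_3": ["AG", "AG", "CT", "TT"],
    "variety_4": ["AA", "AG", "CC", "TT"],  # deliberate duplicate of variety_1
}
print(indistinguishable_pairs(fingerprints))
```

An empty result would mean that every variety in the set is uniquely resolved by the panel; here the deliberate duplicate is reported as the only unresolved pair.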

5. Conclusions

In this study, the population structure and genetic diversity of 107 sugarcane germplasm accessions were analysed using SLAF-seq. Phylogenetic, PCA, and population structure analyses based on 55,750 high-quality SNPs consistently classified the 107 accessions into three groups. Of the 329 candidate SNPs converted to PCR-LDR assays, 177 were validated, comprising 151 core and 26 extended SNP markers. The 151 core markers distinguished all 303 tested sugarcane varieties as well as all 186 individuals from the three segregating hybrid populations. Variety authenticity tests were then performed on the same 62 materials using three marker sets: the 151 core SNP markers developed from the S. officinarum LA-Purple reference genome, the previously developed SSR markers, and the SNP markers developed using S. spontaneum AP85-441 as the reference genome. Although all three marker sets differentiated these materials, the SNP markers developed in this study performed better in terms of genome coverage and in resolving the geographical origin of the varieties.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16030343/s1, Table S1: 107 sugarcane materials used for SLAF-seq; Table S2: 4 multiple reaction systems corresponding to SNPs and primer concentrations; Table S3: Multiplex PCR reaction ligation primer sequences; Table S4: Primer sequences for multiple PCR reaction; Table S5: 303 sugarcane samples; Table S6: Sample SNP GC content statistics; Table S7: Sample SNP information statistics; Table S8: 177 SNP marker details; Table S9: 62 sugarcane materials tested for variety authenticity; Table S10: Summary of Polymorphism Information Content (PIC) for the 151 Core SNP Markers.

Author Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by W.Z., Y.W., Z.Y. and J.Z. The first draft of the manuscript was written by W.Z. and all authors commented on previous versions of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Guangxi Major Science and Technology Project (grant number AA22117001); the Yunnan Revitalization Talents Support Plan and the Yunnan Key Laboratory of Sugarcane Genetic Improvement Open Project (grant number 2023KFKT003); the National Sugar Industry Technical System Post Scientist Project (grant number CARS-17); and the Innovation Fund of Fujian Agriculture and Forestry University (grant numbers KFB23186A and KHF240002).

Data Availability Statement

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Xin, Y.; Tao, F. Optimizing genotype-environment-management interactions to enhance productivity and eco-efficiency for wheat-maize rotation in the North China Plain. Sci. Total Environ. 2019, 654, 480–492.
  2. Habib-Ur-Rahman, M.; Ahmad, A.; Raza, A.; Hasnain, M.U.; Alharby, H.F.; Alzahrani, Y.M.; Bamagoos, A.A.; Hakeem, K.R.; Ahmad, S.; Nasim, W.; et al. Impact of climate change on agricultural production; Issues, challenges, and opportunities in Asia. Front. Plant Sci. 2022, 13, 925548.
  3. Nerkar, G.; Devarumath, S.; Purankar, M.; Kumar, A.; Valarmathi, R.; Devarumath, R.; Appunu, C. Advances in Crop Breeding Through Precision Genome Editing. Front. Genet. 2022, 13, 880195.
  4. Li, J.; Chang, X.; Huang, Q.; Liu, P.; Zhao, X.; Li, F.; Wang, Y.; Chang, C. Construction of SNP fingerprint and population genetic analysis of honeysuckle germplasm resources in China. Front. Plant Sci. 2023, 14, 1080691.
  5. Khlestkina, E.K.; Salina, E.A. SNP markers: Methods of analysis, ways of development, and comparison on an example of common wheat. Genetika 2006, 42, 725–736.
  6. Lippolis, A.; Hollebrands, B.; Acierno, V.; de Jong, C.; Pouvreau, L.; Paulo, J.; Gezan, S.A.; Trindade, L.M. GWAS Identifies SNP Markers and Candidate Genes for Off-Flavours and Protein Content in Faba Bean (Vicia faba L.). Plants 2025, 14, 193.
  7. Berdugo-Cely, J.A.; Martínez-Moncayo, C.; Lagos-Burbano, T.C. Genetic analysis of a potato (Solanum tuberosum L.) breeding collection for southern Colombia using Single Nucleotide Polymorphism (SNP) markers. PLoS ONE 2021, 16, e0248787.
  8. Selga, C.; Chrominski, P.; Carlson-Nilsson, U.; Andersson, M.; Chawade, A.; Ortiz, R. Diversity and population structure of Nordic potato cultivars and breeding clones. BMC Plant Biol. 2022, 22, 350.
  9. Vos, P.G.; Uitdewilligen, J.G.; Voorrips, R.E.; Visser, R.G.; van Eck, H.J. Development and analysis of a 20K SNP array for potato (Solanum tuberosum): An insight into the breeding history. Theor. Appl. Genet. 2015, 128, 2387–2401.
  10. Yang, S.; Chu, N.; Feng, N.; Zhou, B.; Zhou, H.; Deng, Z.; Shen, X.; Zheng, D. Global Responses of Autopolyploid Sugarcane Badila (Saccharum officinarum L.) to Drought Stress Based on Comparative Transcriptome and Metabolome Profiling. Int. J. Mol. Sci. 2023, 24, 3856.
  11. Garsmeur, O.; Droc, G.; Antonise, R.; Grimwood, J.; Potier, B.; Aitken, K.; Jenkins, J.; Martin, G.; Charron, C.; Hervouet, C.; et al. A mosaic monoploid reference sequence for the highly complex genome of sugarcane. Nat. Commun. 2018, 9, 2638.
  12. Healey, A.L.; Garsmeur, O.; Lovell, J.T.; Shengquiang, S.; Sreedasyam, A.; Jenkins, J.; Plott, C.B.; Piperidis, N.; Pompidor, N.; Llaca, V.; et al. The complex polyploid genome architecture of sugarcane. Nature 2024, 628, 804–810.
  13. Fang, H.; Liu, H.; Ma, R.; Liu, Y.; Li, J.; Yu, X.; Zhang, H.; Yang, Y.; Zhang, G. Genome-wide assessment of population structure and genetic diversity of Chinese Lou onion using specific length amplified fragment (SLAF) sequencing. PLoS ONE 2020, 15, e0231753.
  14. Li, C.; Liu, M.; Sun, F.; Zhao, X.; He, M.; Li, T.; Lu, P.; Xu, Y. Genetic Divergence and Population Structure in Weedy and Cultivated Broomcorn Millets (Panicum miliaceum L.) Revealed by Specific-Locus Amplified Fragment Sequencing (SLAF-Seq). Front. Plant Sci. 2021, 12, 688444.
  15. Nashima, K.; Hosaka, F.; Terakami, S.; Kunihisa, M.; Nishitani, C.; Moromizato, C.; Takeuchi, M.; Shoda, M.; Tarora, K.; Urasaki, N.; et al. SSR markers developed using next-generation sequencing technology in pineapple, Ananas comosus (L.) Merr. Breed. Sci. 2020, 70, 415–421.
  16. Xiao, X.O.; Zhang, N.; Jin, H.; Si, H. Genetic Analysis of Potato Breeding Collection Using Single-Nucleotide Polymorphism (SNP) Markers. Plants 2023, 12, 1895.
  17. Bhardwaj, P.; Sinha, S.; Yadav, R.K. Medical and scientific writing: Time to go lean and mean. Perspect. Clin. Res. 2017, 8, 113–117.
  18. Huangwei, C.; Rongjian, T.; Fuan, N.; Jihua, Z.; Bin, S.; Zhongyong, L.; Can, C.; Liming, C. A New PCR/LDR-Based Multiplex Functional Molecular Marker for Marker-Assisted Breeding in Rice. Rice Sci. 2021, 28, 6–10.
  19. Zurn, J.D.; Rouse, M.N.; Chao, S.; Aoun, M.; Macharia, G.; Hiebert, C.W.; Pretorius, Z.A.; Bonman, J.M.; Acevedo, M. Dissection of the multigenic wheat stem rust resistance present in the Montenegrin spring wheat accession PI 362698. BMC Genom. 2018, 19, 67.
  20. Zhang, J.; Zhang, X.; Tang, H.; Zhang, Q.; Hua, X.; Ma, X.; Zhu, F.; Jones, T.; Zhu, X.; Bowers, J.; et al. Allele-defined genome of the autopolyploid sugarcane Saccharum spontaneum L. Nat. Genet. 2018, 50, 1565–1573.
  21. Yang, F.; Lang, T.; Wu, J.; Zhang, C.; Qu, H.; Pu, Z.; Yang, F.; Yu, M.; Feng, J. SNP loci identification and KASP marker development system for genetic diversity, population structure, and fingerprinting in sweetpotato (Ipomoea batatas L.). BMC Genom. 2024, 25, 1245.
  22. Wang, W.; Sun, Y.; Yang, P.; Cai, X.; Yang, L.; Ma, J.; Ou, Y.; Liu, T.; Ali, I.; Liu, D.; et al. A high density SLAF-seq SNP genetic map and QTL for seed size, oil and protein content in upland cotton. BMC Genom. 2019, 20, 599.
  23. Chen, Z.; He, Y.; Iqbal, Y.; Shi, Y.; Huang, H.; Yi, Z. Investigation of genetic relationships within three Miscanthus species using SNP markers identified with SLAF-seq. BMC Genom. 2022, 23, 43.
  24. Linck, E.; Battey, C.J. Minor allele frequency thresholds strongly affect population structure inference with genomic data sets. Mol. Ecol. Resour. 2019, 19, 639–647.
  25. Zhu, Z.; Sun, B.; Lei, J. Specific-Locus Amplified Fragment Sequencing (SLAF-Seq) as High-Throughput SNP Genotyping Methods. Methods Mol. Biol. 2021, 2264, 75–87.
  26. Gerard, D.; Ferrão, L.F.V.; Garcia, A.A.F.; Stephens, M. Genotyping Polyploids from Messy Sequencing Data. Genetics 2018, 210, 789–807.
  27. Zhang, J.; Qi, Y.; Hua, X.; Wang, Y.; Wang, B.; Qi, Y.; Huang, Y.; Yu, Z.; Gao, R.; Zhang, Y.; et al. The highly allo-autopolyploid modern sugarcane genome and very recent allopolyploidization in Saccharum. Nat. Genet. 2025, 57, 242–253.
  28. Yang, Y.; Lyu, M.; Liu, J.; Wu, J.; Wang, Q.; Xie, T.; Li, H.; Chen, R.; Sun, D.; Yang, Y.; et al. Construction of an SNP fingerprinting database and population genetic analysis of 329 cauliflower cultivars. BMC Plant Biol. 2022, 22, 522.
  29. Xing, X.; Hu, T.; Wang, Y.; Li, Y.; Wang, W.; Hu, H.; Wei, Q.; Yan, Y.; Gan, D.; Bao, C.; et al. Construction of SNP fingerprints and genetic diversity analysis of radish (Raphanus sativus L.). Front. Plant Sci. 2024, 15, 1329890.
  30. Pan, Q.; Gao, M.; Wu, P.; Yan, J.; AbdelRahman, M.A.E. Image Classification of Wheat Rust Based on Ensemble Learning. Sensors 2022, 22, 6047.
  31. Moolhuijzen, P.; Sanglard, L.; Paterson, D.J.; Gray, S.; Khambatta, K.; Hackett, M.J.; Zerihun, A.; Gibberd, M.R.; Naim, F. Spatiotemporal patterns of wheat response to Pyrenophora tritici-repentis in asymptomatic regions revealed by transcriptomic and X-ray fluorescence microscopy analyses. J. Exp. Bot. 2023, 74, 4707–4720.
  32. Jamali, S.H.; Cockram, J.; Hickey, L.T. Insights into deployment of DNA markers in plant variety protection and registration. Theor. Appl. Genet. 2019, 132, 1911–1929.
Figure 1. (A) Distribution of SLAF tags on genomic chromosomes; (B) Distribution of SNP markers on genomic chromosomes.
Figure 2. (A) Distribution of minor allele frequency (MAF) based on high quality SNPs; (B) Observed heterozygosity vs. expected heterozygosity based on high quality SNPs. Each blue dot represents a single locus. The red dashed line (y = x) denotes the expected relationship under Hardy-Weinberg equilibrium.
Figure 3. (A) Neighbor-joining (NJ) phylogenetic tree of 107 samples. Group I is shown in red, Group II is shown in green, Group III is shown in blue; (B) Principal Component Analysis (PCA) plot of 107 samples; (C) Cross-validation error plot of 107 subsamples; (D) Population structure analysis of 107 samples.
Figure 4. (A) Distribution of 151 core SNP markers on genomic chromosomes; (B) Neighbour-joining (NJ) tree based on the genotyping data of 303 samples with 151 core SNP markers; (C) Neighbour-joining (NJ) tree based on the genotyping data of 186 segregating population samples with 151 core SNP markers.
Figure 5. The same ordinal number in each plot indicates the same variety across the three molecular marker systems, and the colours of the ordinal numbers reflect the different geographical origins. (A) SNP markers with S. officinarum LA-Purple as the reference genome; (B) SNP markers with S. spontaneum AP85-441 as the reference genome; (C) SSR markers, with a clustered NJ phylogenetic tree constructed for the same varieties; (D) Neighbour-joining (NJ) phylogenetic tree of 107 samples constructed with high-quality SNP markers using S. spontaneum AP85-441 as the reference genome. Group I is shown in red, Group II in green, and Group III in blue; (E) Principal component analysis (PCA) plot of 107 samples constructed with high-quality SNP markers using S. spontaneum AP85-441 as the reference genome. Groups were assigned according to the cluster analysis constructed with S. officinarum LA-Purple as the reference genome, with Group I in red, Group II in green, and Group III in blue.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhao, W.; Wang, Y.; Yang, Z.; Zhao, J.; Huang, C.; Huang, G.; Xu, L.; Liu, J.; Zhao, Y.; Zhang, Y.; et al. Development and Application of an LDR-Based SNP Panel for High-Resolution Genotyping and Variety Identification in Sugarcane. Agronomy 2026, 16, 343. https://doi.org/10.3390/agronomy16030343

Zhao, W., Wang, Y., Yang, Z., Zhao, J., Huang, C., Huang, G., Xu, L., Liu, J., Zhao, Y., Zhang, Y., Deng, Z., & Zhao, X. (2026). Development and Application of an LDR-Based SNP Panel for High-Resolution Genotyping and Variety Identification in Sugarcane. Agronomy, 16(3), 343. https://doi.org/10.3390/agronomy16030343
