Article

Chloroplast Genome-Based Insights into Variety Identification in Toona sinensis

1
School of Food and Biological Engineering, Chengdu University, Chengdu 610106, China
2
Sichuan Wolong National Natural Reserve Administration Bureau, Wenchuan 623006, China
3
Institute for Advanced Study, Chengdu University, No. 2025 Chengluo Road, Chengdu 610106, China
4
Sichuan-Xizang Medicinal Resource Breeding and Standardization Team, Engineering Research Center of Sichuan-Xizang Traditional Medicinal Plant, Chengdu University, Chengdu 610106, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Agronomy 2026, 16(1), 127; https://doi.org/10.3390/agronomy16010127
Submission received: 12 November 2025 / Revised: 30 December 2025 / Accepted: 31 December 2025 / Published: 4 January 2026
(This article belongs to the Section Crop Breeding and Genetics)

Abstract

Modern sequencing technologies have transformed the identification of medicinal plant species and varieties, overcoming the limitations of traditional approaches. To address the challenge of discriminating Toona sinensis varieties, we sequenced and compared 15 complete chloroplast genomes from five varieties in northern China. Although these genomes exhibited a highly conserved structure, we identified eight variety-specific simple sequence repeats (SSRs), two unique tandem repeats, and several hypervariable regions with elevated nucleotide diversity. Phylogenetic analysis demonstrated that whole chloroplast genomes provided the highest resolution for variety identification, outperforming conventional barcodes. Furthermore, we developed 13 specific primer pairs targeting variable regions, and PCR validation confirmed their reliable amplification across varieties. In addition, sequence-level validation by Sanger sequencing of representative SSR and tandem repeat markers revealed stable, variety-specific repeat copy number differences. These results demonstrate that the identified chloroplast markers can effectively discriminate closely related T. sinensis varieties. This study confirms that despite overall conservation, the T. sinensis plastome contains sufficient variation for reliable identification, providing a robust framework for future germplasm conservation and molecular breeding.

1. Introduction

Toona sinensis is a perennial deciduous tree native to China, known for its unique flavor and its nutritional and medicinal value [1]. The tree has a long history of use as a food source and a traditional herbal remedy. Because morphological traits are highly similar among T. sinensis varieties, identification based solely on physical characteristics is challenging and often unreliable, underscoring the need for effective molecular tools to distinguish closely related varieties. In the past decade, advances in next-generation sequencing technologies have significantly reduced the cost and improved the quality of chloroplast genome sequencing [2]. Compared to the nuclear genome, the chloroplast genome is relatively small, structurally conserved, and contains a highly conserved set of coding genes, making it a powerful resource in plant phylogenetic studies and species identification [3,4,5]. Chloroplast genome data have also been used successfully to discriminate Toona ciliata, Ardisia, and Vitis at both species and variety levels [6,7,8].
Despite these advancements, research on T. sinensis chloroplast genomes remains limited. The first complete chloroplast genome of T. sinensis was reported by Liu et al. [9]. Subsequently, Li et al. [10] conducted a phylogenetic analysis of the genus Toona based on whole plastomes. However, studies on the rapid and precise identification of different T. sinensis varieties using chloroplast genome markers are still lacking. Therefore, there is an urgent need to develop reliable chloroplast genome-based molecular markers for precise variety identification to support the germplasm conservation and breeding efforts in T. sinensis.
In recent years, the increasing availability of complete chloroplast genomes has greatly promoted the development of plastome-based tools for high-resolution molecular identification. Whole plastomes offer richer genetic signals than single-locus barcodes because they contain abundant single-nucleotide polymorphisms, indels, repeat-related variations, and divergence hotspots, enabling discrimination even among closely related cultivars [3]. Repetitive elements such as simple sequence repeats (SSRs), long repeats, and tandem repeats have proven particularly informative for cultivar or variety differentiation in many medicinal and horticultural plants, including Chrysanthemum, Paeonia, Callitropsis, and Aglaonema [11,12,13,14]. Likewise, hypervariable plastome regions such as ycf1, ndhF, and trnT-trnF have shown strong discriminatory power across woody and medicinal taxa [15,16,17,18]. These findings collectively demonstrate that plastome-derived markers can provide reliable resolution at the variety level, highlighting the potential value of a systematic plastome analysis for T. sinensis, which has not yet been undertaken.
Therefore, we sequenced, assembled, and annotated the complete chloroplast genomes of 15 T. sinensis accessions representing five cultivated varieties from northern China. As the first comprehensive comparative plastomics study focused on varietal discrimination within this economically important species, this study had three objectives: (1) to analyze and compare repetitive elements, such as SSRs and long repeats, in order to identify novel variety-specific molecular markers in T. sinensis; (2) to detect hypervariable regions with elevated nucleotide diversity through comparative genomics; and (3) to validate representative repeat-based markers at the sequence level to assess their discriminatory capacity among closely related varieties. We also performed a comprehensive phylogenetic analysis using whole chloroplast genomes and selected variable regions to assess their effectiveness in resolving relationships among T. sinensis varieties, thereby establishing a reliable plastome-based framework for variety identification and supporting germplasm conservation and breeding applications.

2. Materials and Methods

2.1. Plant Materials and DNA Sequencing

Fresh young leaves were collected from five distinct T. sinensis varieties, including T. sinensis var. Jiaozuohong (JZ), var. Linqu (LQ), var. Woyunpu (LW), var. Qingzhouhong (QZ), and var. Hebeihong (HB), cultivated across 15 geographically diverse sites located in Henan, Shandong and Hebei provinces in Northern China (Table S1). All plant materials were morphologically identified by Hongqiang Lin (13348986271@163.com), a plant taxonomist and co-author of this study. Total genomic DNA was extracted using the cetyltrimethylammonium bromide method [19]. Paired-end sequencing libraries with an average insert size of 350 bp were prepared with the NEBNext Ultra DNA Library Prep Kit (NEB, Ipswich, MA, USA) and sequenced on an Illumina HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA).

2.2. Chloroplast Genome Assembly and Annotation

Low-quality and adapter-containing reads were removed using fastp v0.24.0 [20]. The quality of the filtered reads was subsequently assessed using FastQC v0.12.0 [21]. The high-quality reads were then de novo assembled into complete chloroplast genomes using NOVOPlasty v4.2.1 [22], with the published T. sinensis chloroplast genome sequence (OK572965.1) as the reference and seed input. The resulting assemblies were annotated using Plann v1.1.2 [23], and then manually curated and validated to ensure accuracy. Subsequently, physical maps of the assembled genomes were generated using OGDRAW v1.3.1 [24] to depict the spatial organization of the inverted repeat (IR) and single-copy (SC) regions and the annotated genes in circular configuration. Additionally, to assess sequence-level conservation, we analyzed relative synonymous codon usage (RSCU) with CodonW v1.4.2 [25] and predicted RNA editing sites in silico using PREPACT3 v3.12.0 [26].
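For readers unfamiliar with RSCU, the following is a minimal Python sketch of the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It is not the CodonW implementation; the function name `rscu` and the toy input sequence are illustrative assumptions, and the standard genetic code is assumed for codon-to-amino-acid assignments.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids absent from the sequence
        for codon in family:
            # observed count / (total for this amino acid / number of synonyms)
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy input (made up): four leucine codons, three CTG and one TTA.
example = rscu("CTGCTGCTGTTA")
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates the opposite; identical RSCU profiles across accessions, as reported in this study, are therefore consistent with conserved coding sequences.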

2.3. Repeat Sequence Analysis

Chloroplast SSRs exhibit high intraspecific variability, making them promising candidates for molecular marker development [27]. Therefore, six SSR motif types, including mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats, were characterized using MISA v.1.01 [28] with minimum repeat thresholds of 10, 5, 4, 3, 3, and 3 units, respectively. Two adjacent SSRs with an inter-locus distance shorter than 100 bp were classified as compound SSRs. To ensure marker reliability, only SSR loci that were identical across all three biological replicates of a given variety at the consensus sequence level were retained as candidate variety-specific markers; inconsistent loci were excluded. This filtering strategy was applied to avoid unstable or assembly-dependent SSR loci and was not intended to assess intra-individual heteroplasmy. To expand the repertoire of molecular markers, we analyzed the long repetitive sequences, including forward, reverse, complement, and palindromic repeats, using REPuter v2.74 [29], with 30 bp as the minimum repeat length and three as the Hamming distance. We also identified the tandem repeats using Tandem Repeats Finder v4.09 [30] with default parameters.
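The thresholds described above can be illustrated with a short Python sketch. This is not MISA itself, only a simplified approximation that detects perfect repeats with the minimum unit counts given in the text (10, 5, 4, 3, 3, and 3) and groups SSRs separated by fewer than 100 bp into compound loci. The function names and the test sequence are hypothetical.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself built from a shorter repeated unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs (0-based, end exclusive)."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of `size` bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    hits.sort()
    return hits


def group_compound(hits, max_gap=100):
    """Group consecutive SSRs whose separation is shorter than max_gap bp."""
    groups = []
    for hit in hits:
        if groups and hit[0] - groups[-1][-1][1] < max_gap:
            groups[-1].append(hit)
        else:
            groups.append([hit])
    return groups


# Made-up test sequence: (A)12 and (AT)6 lie 11 bp apart; (TTAGGA)3 lies 120 bp downstream.
spacer = "GACGTTGC" * 15
seq = "GCTAGC" + "A" * 12 + "CGTACGTTAGC" + "AT" * 6 + spacer + "TTAGGA" * 3 + "GCTA"
hits = find_ssrs(seq)
groups = group_compound(hits)
```

In this toy example the first two SSRs form one compound locus, while the hexanucleotide repeat stays separate. Real tools also handle edge cases, such as overlapping motifs, that this sketch ignores.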

2.4. Chloroplast Genome Comparison and Sequence Divergence Analysis

Structural variations (SVs) in the IR regions of chloroplast genomes are frequently observed, even among congeneric species or different cultivars within the same species. These changes can alter the size of the chloroplast genome, cause gene duplication/reduction events, and generate pseudogenes [31]. To investigate SVs in IR regions across T. sinensis varieties, we performed a comparative analysis of IR/SC boundary shifts and their adjacent genes using CPJSdraw v1.0 [32]. Gene orders and rearrangements were identified through whole-genome alignments implemented in Geneious Prime v9.0.2 [33], while sequence comparisons of the 15 T. sinensis plastomes were performed using Proksee [34] for circular genome visualization and mVISTA [35] (Shuffle-LAGAN mode) for linear alignment visualization. Nucleotide diversity (Pi) across chloroplast genomes was calculated using DnaSP v5.0 [36], with a window size of 600 bp and step size of 200 bp.
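As an illustration of the sliding-window calculation, the sketch below computes Pi as the mean number of pairwise differences per site over an alignment, using the 600 bp window and 200 bp step given above as defaults. It is a simplified approximation rather than the DnaSP algorithm: here, columns containing gaps or ambiguity codes are simply skipped, which is an assumption, and the function names and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for equal-length aligned sequences."""
    seqs = [s.upper() for s in seqs]
    # Keep only alignment columns made entirely of unambiguous bases.
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    if not columns or n_pairs == 0:
        return 0.0
    diffs = sum(a != b for col in columns for a, b in combinations(col, 2))
    return diffs / (n_pairs * len(columns))


def sliding_pi(seqs, window=600, step=200):
    """Return (window_start, Pi) for each complete window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment (made up): three sequences, one variable site in the last column.
alignment = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
overall = nucleotide_diversity(alignment)
windows = sliding_pi(alignment, window=4, step=2)
```

With three sequences there are three pairs, two of which differ at one of eight sites, so the overall Pi is 2/24. Windows exceeding a chosen threshold could then be flagged as candidate hypervariable regions.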

2.5. Phylogenetic Tree Construction

Three types of datasets were employed in phylogenetic analysis to evaluate their effectiveness in T. sinensis variety identification: (i) the conventional chloroplast DNA barcodes matK, rbcL, and trnH-psbA, individually and in combination; (ii) hypervariable regions identified from the Pi distribution, defined by a sliding-window average Pi > 0.001; and (iii) complete chloroplast genome sequences. For each dataset, multiple sequence alignments were performed with MAFFT-LINSI v7.313 [37], and maximum likelihood (ML) trees were constructed using RAxML v8.2.11 [38] under the GTRGAMMA model with 1000 standard bootstrap replicates. Two T. ciliata var. henryi accessions, SM1 (OP373442) and SM2 (OP373441), were used as outgroups to root the trees.

2.6. Development and Validation of Molecular Markers for Species Discrimination

Based on comparative analysis of the chloroplast genomes, molecular markers targeting candidate variety-specific regions, including SSRs, tandem repeats, and other hypervariable loci, were developed for T. sinensis. In total, 13 primer pairs were designed to amplify diagnostic regions identified from plastome comparisons.
PCR reactions were carried out in a total volume of 10 µL containing 5 µL 2 × PCR Mix, 0.5 µL each of the sense and antisense primers, 1 µL genomic DNA, and 3 µL ddH2O. Thermal cycling consisted of 94 °C for 4 min, followed by 35 cycles of 94 °C for 30 s and 52–58 °C for 30 s, and a final step at 72 °C for 7 min. PCR products were visualized on 2% agarose gels after electrophoresis.
To validate the discriminatory capacity of the molecular markers at high resolution, a subset of four representative loci was selected for sequencing-based analysis. This subset included two SSR loci and two tandem repeat loci and was chosen to represent the major types of repeat-based markers identified in the chloroplast genomes. PCR products corresponding to these loci were purified and subjected to bidirectional Sanger sequencing using the same primers as those employed for PCR amplification. Sequencing was performed using BigDye Terminator chemistry on an ABI 3730XL DNA Analyzer (Applied Biosystems, Foster City, CA, USA).
Sequencing chromatograms were inspected for quality, and high-quality sequences were aligned to identify repeat motifs and to determine repeat copy number variation at single-base resolution. This sequence-level analysis confirmed stable length polymorphisms at representative SSR and tandem repeat loci among different T. sinensis varieties, providing high-resolution evidence for the discriminatory effectiveness of the developed molecular markers.

2.7. Read-Level Heteroplasmy Detection

To assess potential intra-individual heteroplasmy at candidate marker regions, a read-level variant calling analysis was performed. Clean Illumina reads from all 15 accessions were mapped to their corresponding representative chloroplast genomes using BWA-MEM v0.7.17 [39]. The resulting BAM files were sorted and indexed using SAMtools v1.3.1 [40]. Low-frequency single-nucleotide polymorphisms (SNPs) and small indels were identified using LoFreq v2.1.5 [41], which is specifically designed to detect variants at low allele frequencies. Allele frequency (AF) and read depth (DP) information were extracted from the resulting VCF files, and variants located within or adjacent (±10 bp) to candidate marker regions were examined to evaluate the presence and extent of low-frequency heteroplasmy.
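The final filtering step can be sketched in Python as follows. The sketch assumes that DP and AF are stored as key=value entries in the VCF INFO column, as in LoFreq output, and it reads plain VCF text lines instead of calling any external library. The region names, coordinates, and variant records are invented for illustration only.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=5000;AF=0.012' into a dict of strings."""
    info = {}
    for item in info_field.split(";"):
        if "=" in item:  # flag-type entries without a value are ignored
            key, value = item.split("=", 1)
            info[key] = value
    return info


def variants_near_markers(vcf_lines, regions, flank=10):
    """Return (region, position, AF, DP) for variants within or +/- flank bp of a region."""
    hits = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        pos = int(fields[1])
        info = parse_info(fields[7])
        for name, (start, end) in regions.items():
            if start - flank <= pos <= end + flank:
                hits.append((name, pos, float(info["AF"]), int(info["DP"])))
    return hits


# Hypothetical marker regions (1-based, inclusive) and made-up variant records.
regions = {"SSR_1": (1010, 1027), "hotspot_1": (2000, 2030)}
vcf_lines = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "cp\t1005\t.\tA\tG\t80\tPASS\tDP=5000;AF=0.012000;SB=0;DP4=2400,2540,30,30",
    "cp\t1500\t.\tC\tT\t90\tPASS\tDP=4200;AF=0.020000;SB=1;DP4=2050,2066,42,42",
    "cp\t2033\t.\tT\tC\t75\tPASS\tDP=6100;AF=0.026000;SB=0;DP4=2970,2971,80,79",
]
hits = variants_near_markers(vcf_lines, regions)
```

Here the variant at position 1500 is discarded because it lies outside both flanked regions, while the other two are retained together with their allele frequency and depth.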

2.8. Statistical Analysis

Three independent biological replicates were analyzed per variety to assess intra-varietal consistency. The overall analytical approach was descriptive and comparative, given the high conservation of chloroplast genomes at the varietal level. Formal hypothesis testing for comparing means (e.g., ANOVA) was not performed; variety discrimination relied instead on the presence/absence of conserved diagnostic markers, phylogenetic topology, and nucleotide diversity patterns. All statistical computing and visualization were conducted using Python v3.9, R v4.2.2, and GraphPad Prism v9.5.

3. Results

3.1. General Features of the T. sinensis Chloroplast Genomes

Illumina sequencing generated 2.43–10.70 Gb of raw data across the 15 T. sinensis accessions (Table S1). Following the removal of low-quality and adapter-containing reads, de novo assembly produced chloroplast genomes ranging from 159,252 to 159,311 bp, with a conserved quadripartite architecture comprising a large SC (LSC) region of 86,890 to 87,007 bp, a small SC (SSC) region of 18,332 to 18,346 bp, and two IR regions of 26,981 to 27,019 bp each (Figure 1; Table 1). The 15 chloroplast genomes exhibited a conserved GC content of approximately 37.9%, while the IR regions showed markedly higher GC levels (~42.8%) than the LSC (~36.0%) and SSC (~32.2%) regions (Table 1).
All the chloroplast genomes harbored 129 genes with identical gene order across the 15 accessions (Figure S1), comprising 84 protein-coding genes (PCGs), 37 transfer RNA genes (tRNAs), and eight ribosomal RNA genes (rRNAs). Among these genes, eighteen were duplicated in the IR regions, including seven PCGs (rps7, rps12, rpl23, rpl2, rps19, ycf2, and ndhB), seven tRNAs (trnN-GTT, trnR-ACG, trnA-TGC, trnI-GAT, trnV-GAC, trnL-CAA, and trnM-CAT), and four rRNAs (rrn16S, rrn23S, rrn4.5S and rrn5S). There were eighteen intron-containing genes, with the majority (83.33%) harboring a single intron, while three genes (rps12, ycf3, and clpP) had two introns each (Table S2). This conserved gene content and order across all accessions underscores the high stability of the chloroplast genome in T. sinensis. Further supporting this high conservation, analyses of RSCU (Figure S2A) and in silico predicted RNA editing sites (Figure S2B) revealed identical patterns across all T. sinensis varieties. Consequently, reliable varietal discrimination must rely on sequence-level polymorphisms rather than structural variations.

3.2. Identification and Features of Chloroplast Repeats

The 15 chloroplast genomes showed minimal variation in SSR counts, ranging from 89 in LW accessions to 94 in QZ and LQ accessions (Figure 2A). Compound SSRs constituted only 8.70–9.57% of all identified SSRs, whereas simple repeat motifs dominated the SSR types, displaying a broad size range from 10 bp to 106 bp. Mononucleotide repeats were the most prevalent, ranging from 69.57 to 70.21%, followed by tetranucleotide with a range between 10.64 and 11.24% and dinucleotide motifs between 9.57 and 10.11%. However, tri-, penta-, and hexanucleotide repeats were rare and collectively less than 10%. All mononucleotide SSRs were exclusively composed of A or T bases, while other SSR types exhibited higher motif diversity. For example, dinucleotide repeats contained AG/CT and AT/AT motifs, while tetranucleotide repeats included motifs such as AAAG/CTTT, AAAT/ATTT, AACG/CGTT, ACAT/ATGT, and AGAT/ATCT (Figure S3). Despite the overall conservation in SSR abundance and type, we identified eight variety-specific SSRs, including a compound SSR unique to JZ accessions, and seven LW-specific SSRs comprising three mononucleotide, two hexanucleotide, and two compound motifs (Figure 2B; Table S4). These variety-specific loci represent high-value candidates for developing diagnostic molecular markers.
Analysis of long repetitive sequences revealed conserved patterns across the 15 chloroplast genomes. Specifically, we identified one complement repeat of 30–31 bp, 10–11 forward repeats of 30–46 bp, and 18–19 palindromic repeats of 30–58 bp per genome, while reverse repeats were absent (Figure 2C). Although one complement and three forward repeats were shared among multiple varieties, none of the repeats were unique to any single variety (Figure 2D). This complete lack of variety specificity indicates that long repeats are not suitable for intraspecific discrimination in T. sinensis.
In addition, each chloroplast genome contained 20–24 tandem repeats ranging from 26 to 89 bp, with 95.28% having more than 2.0 copies (Figure 2E). Intriguingly, two distinct tandem repeats of 50 bp and 30 bp were uniquely identified in JZ accessions (Figure 2F; Table S5). The presence of these JZ-specific tandem repeats, together with the JZ-specific compound SSR mentioned above, provides a clear molecular signature for distinguishing the JZ variety.
Collectively, the repeat element analysis reveals a landscape of broad conservation punctuated by discrete, variety-specific variations in SSRs and tandem repeats. These identified variations form a concrete genomic foundation for precise variety identification of T. sinensis.

3.3. Structural Dynamics and Hypervariable Sequence Landscapes

The 15 chloroplast genomes also exhibited complete structural conservation, including all IR/SC boundary regions. To establish a stable genomic backdrop essential for reliable varietal discrimination, we first verified the absence of large-scale structural changes. All IR-associated genes, including ndhF, ycf1, rps3, trnH and rpl22, maintained invariant configurations, confirming perfect preservation of IR architecture among the T. sinensis varieties used (Figure S4). Comparative analysis by Proksee and mVISTA further demonstrated complete conservation in both genome structure and sequence across all T. sinensis accessions (Figures S5 and S6).
Having confirmed overall structural stability, we then focused on detecting subtle sequence-level variations. To assess the potential sequence divergence, we analyzed the Pi distributions across 600 bp sliding windows, with values ranging from 0 to 0.00171 and a mean of 0.00012 (Figure 3). Regional variation in Pi followed the pattern SSC (0.00030) > LSC (0.00014) > IR (0.000032), confirming the sequence conservation of the IR regions. Based on the Pi distribution, we identified three hypervariable regions with Pi > 0.001: two in the SSC region, ycf1 (Pi = 0.00171) and ndhF (Pi = 0.00143), and one in the LSC region, trnT-TGT–trnF-GAA (Pi = 0.00143) (Figure 3). These hypervariable regions, exhibiting the highest nucleotide diversity across the plastome, were thus identified as the most promising genomic targets for developing markers to distinguish T. sinensis varieties.

3.4. Phylogenetic Analysis

The discriminatory efficacy of the conventional chloroplast DNA barcodes matK, rbcL, and trnH-psbA was evaluated phylogenetically to assess their capacity to differentiate T. sinensis varieties. All three markers demonstrated consistently low nucleotide diversity, with Pi = 0 for rbcL and Pi < 0.0005 for matK and trnH-psbA (Figure 3). Nevertheless, matK successfully discriminated JZ accessions from other varieties, whereas rbcL and trnH-psbA exhibited no discriminative capacity; a multilocus barcode combining all three markers differentiated among JZ, LW, and other accessions (Figure 4). Further, phylogenetic analysis based on the three hypervariable regions with Pi > 0.001 (ycf1, ndhF, and trnT-TGT–trnF-GAA) demonstrated that ndhF and trnT-TGT–trnF-GAA successfully distinguished JZ accessions from other varieties and ycf1 showed no discriminative capacity, while the combination of all three hypervariable markers differentiated LW accessions from other varieties (Figure 5). Finally, ML analysis based on complete chloroplast genome sequences revealed that the entire chloroplast genome exhibits superior discriminative capacity compared to both conventional and hypervariable markers. The complete chloroplast genome could differentiate HB, LW, and JZ from one another and from the QZ and LQ accessions with strong bootstrap support (>80%), although it showed limited resolution between QZ and LQ (Figure 6).

3.5. Molecular Marker Development and Sequence-Level Validation

To assess the amplifiability of the designed primer pairs (Table S6), PCR amplification was performed across all 15 T. sinensis accessions. All 13 primer pairs successfully produced clear amplicons of the expected size, indicating that the targeted chloroplast regions are accessible and can be consistently amplified across the tested germplasm (Figure S7).
To evaluate the discriminatory capacity of the developed molecular markers, four representative loci, including two SSR loci and two tandem repeat loci, were subjected to Sanger sequencing across multiple individuals of each variety. For the SSR markers, the repeat motifs (TTAGGA)n and (TCCTAA)n showed consistent copy number variation among varieties (Figure 7A,B and Figures S8 and S9). In the LW variety, all three tested individuals exhibited three repeat units at both loci, whereas only two repeat units were detected in all other examined varieties, demonstrating the LW-specific nature of these SSR markers.
For the tandem repeat markers, distinct variety-specific patterns were also observed. The tandem repeat sequence TAAATTCTTTATTCAATTATAAAT was detected with two repeat units in all three individuals of the JZ variety, while no repeat units were observed in other varieties (Figure 7C and Figure S10). Similarly, the tandem repeat AATATAGAATAGGAA exhibited two repeat units exclusively in the JZ variety and was absent from the remaining varieties (Figure 7D and Figure S11). These sequence-level differences provide direct evidence that both SSR- and tandem repeat-based markers can effectively discriminate closely related T. sinensis varieties.
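The copy-number comparison described above can be sketched as a simple count of consecutive motif copies in a sequenced amplicon. The motifs TTAGGA and AATATAGAATAGGAA are taken from the text, whereas the flanking sequences and the function name below are made up for illustration and do not correspond to the actual amplicons.

```python
import re


def max_copy_number(sequence, motif):
    """Return the largest number of back-to-back copies of `motif` found in `sequence`."""
    pattern = re.compile("(?:%s)+" % re.escape(motif.upper()))
    run_lengths = [m.end() - m.start() for m in pattern.finditer(sequence.upper())]
    return max(run_lengths, default=0) // len(motif)


# Invented flanking sequences standing in for the regions around each repeat.
left_flank, right_flank = "GCATC", "CGTAG"

lw_like = left_flank + "TTAGGA" * 3 + right_flank      # three units, as reported for LW
other_like = left_flank + "TTAGGA" * 2 + right_flank   # two units, as reported elsewhere
jz_like = left_flank + "AATATAGAATAGGAA" * 2 + right_flank
no_repeat = left_flank + right_flank

copies = {
    "lw_ssr": max_copy_number(lw_like, "TTAGGA"),
    "other_ssr": max_copy_number(other_like, "TTAGGA"),
    "jz_tandem": max_copy_number(jz_like, "AATATAGAATAGGAA"),
    "absent": max_copy_number(no_repeat, "AATATAGAATAGGAA"),
}
```

In practice, the same count would be applied to the consensus of the forward and reverse Sanger reads after quality trimming; this sketch only detects perfect copies and does not tolerate sequencing errors within a repeat.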

3.6. Read-Level Heteroplasmy at Candidate Marker Regions

To further evaluate the intra-individual stability of the candidate chloroplast markers identified above, we performed read-level variant calling to detect potential low-frequency heteroplasmy. Across all 15 accessions, low-frequency SNPs were detected at several candidate marker regions, including SSR loci and hypervariable regions (Table S7). The detected variants exhibited allele frequencies ranging from approximately 1.0% to 2.6%, despite consistently high sequencing depths (3000–8000×). These variants were located either within marker regions or within ±10 bp flanking regions of SSR loci. No variant exceeded 3% allele frequency in any accession, and none altered the consensus sequence or repeat structure of the corresponding marker loci.

4. Discussion

4.1. Structural Conservation and Intraspecific Variation in the T. sinensis Chloroplast Genomes

Chloroplast genomes in higher plants are generally characterized by a highly conserved structure and sequence composition [42]. These characteristics are evident in different species within the same genus, including Ardisia, Vitis, Aconitum, and Pulsatilla [7,8,43,44], and among different varieties within a single species, such as T. ciliata, Chrysanthemum morifolium, Malus domestica (Fuji), and Scutellaria baicalensis [6,11,45,46]. The plastomes of these plants also exhibit a stable quadripartite structure, consistent gene content and order, conserved IR boundaries, and uniform codon usage. Our comparative analysis supports this general pattern, revealing highly conserved genome architecture, gene order, codon usage, and IR/SC junctions, and minimal variation in genome size among the 15 T. sinensis accessions. However, subtle but informative intraspecific differences were detected, as has been observed in other species. The variation among the 15 T. sinensis accessions was primarily reflected in the SSR types and counts, the presence of variety-specific repeat motifs, including unique SSRs in the JZ and LW accessions, and the distribution of long and tandem repeats. Moreover, sliding window analysis of Pi revealed several hypervariable regions, particularly in the SSC and LSC regions, including ndhF, ycf1, and trnT-trnF. Similar patterns have been reported in S. baicalensis, where sequence divergence in regions like matK-rps16 and petA-psbJ, as well as SSR and long repeat differences, were found among its cultivars [46]. Likewise, a unique hexanucleotide SSR was identified only in T. ciliata var. pubescens and not in the other three varieties assessed [6]. Therefore, our findings demonstrate that the subtle internal variations in the T. sinensis chloroplast genome, particularly within repetitive elements and localized divergence hotspots, constitute the precise genomic resources enabling reliable variety identification, directly supporting the premise of our study.

4.2. The Utility of Repetitive Elements in T. sinensis Variety Identification

Chloroplast SSRs are primarily generated through DNA replication slippage, where the DNA polymerase temporarily dissociates and rebinds out of register, leading to the expansion or contraction of short repeat motifs [47]. This mechanism, combined with limited error correction in the plastid genome, contributes to the high variability of SSRs, particularly in A/T-rich non-coding regions. In our study, SSRs showed a modest variation in total counts among the 15 T. sinensis accessions, though eight variety-specific SSRs were identified, mostly in the JZ and LW accessions, demonstrating their potential as variety-level molecular markers. The high mutability, simple detection via PCR, and codominant inheritance of SSRs make them suitable for variety identification.
Longer tandem repeats, in contrast, arise through more complex mechanisms, including unequal homologous recombination, template switching during replication, or DNA secondary structure-induced polymerase stalling [48]. Although typically more conserved than SSRs, we detected two unique tandem repeats in JZ accessions, which may reflect lineage-specific genome events and may provide additional discriminatory value to complement SSRs in building multilocus identification systems in T. sinensis. Other types of long repeats, including forward, palindromic, complement, and reverse repeats, appear to play a predominantly structural role in the stability of the plastid genome. These repeats originate from recombination or sequence slippage but tend to be under selective constraint once established [49]. In T. sinensis, these repeats were remarkably conserved across all accessions, with no variety-specific motifs observed, indicating their limited utility for intraspecies discrimination, although they may be relevant for structural and evolutionary studies.
Thus, SSRs and specific tandem repeats offer the most promising repetitive element-based markers for T. sinensis variety identification. SSRs have been widely and successfully applied in variety identification across diverse species, such as C. morifolium, Paeonia suffruticosa, Callitropsis funebris, and Aglaonema commutatum [11,12,13,14]. In contrast, although tandem repeats demonstrated variety-specific patterns in this study, their application in chloroplast-based variety identification remains rarely reported in the previous literature. Other types of long repeats exhibited high sequence conservation across T. sinensis accessions and therefore lack discriminatory power, limiting their practical value as molecular markers. These findings collectively underscore the importance of repeat type selection when developing chloroplast-based genomic tools for intraspecies differentiation. Read-level variant calling further revealed only low-frequency heteroplasmic variants (<3%) at SSR loci and their immediate flanking regions, despite high sequencing depth. Importantly, these variants did not alter the dominant repeat structures or consensus sequences of the identified SSR markers, supporting their robustness for variety identification.

4.3. The Role of Hypervariable Regions and Conventional Markers in T. sinensis Variety Identification

Hypervariable regions in chloroplast genomes arise from a combination of molecular mechanisms, including DNA replication slippage, local mutation hotspots, weak selective constraints, and occasional recombination events, particularly near IR/SC junctions [50]. These regions typically accumulate more substitutions and indels, especially within non-coding intergenic spacers or near genes such as ycf1 and ndhF, which are frequently reported as plastome divergence hotspots across diverse plant taxa [15,16,17,18]. In the current study, a sliding window analysis of nucleotide diversity revealed three hypervariable regions with Pi > 0.001, including ycf1 and ndhF in the SSC region, and the trnT-TGT–trnF-GAA intergenic spacer in the LSC region. These regions exhibited higher mutation rates than the surrounding genomic regions and showed potential for distinguishing specific T. sinensis accessions. Phylogenetic analysis confirmed that ndhF and trnT-TGT–trnF-GAA provided effective discrimination of JZ accessions, while ycf1, despite its high variability, did not significantly contribute to varietal resolution. The differential utility of these regions may reflect their underlying mutational mechanisms and sequence contexts.
Conventional chloroplast DNA barcodes such as matK, rbcL, and trnH-psbA have been widely applied for species and variety identification in plants [51]. In some species, these markers are located within or adjacent to hypervariable regions and have demonstrated high discriminative power [52,53,54]. However, in T. sinensis, all three markers exhibited extremely low nucleotide diversity, with rbcL showing no variation at all. Their capacity to resolve varieties was also limited, with only matK showing weak discriminative power, restricted to JZ accessions. Nevertheless, the phylogenetic analysis achieved moderate variety-level resolution when matK was used in combination with the other two barcodes. These findings underscore the importance of identifying species-specific hypervariable regions for marker development rather than relying solely on conventional barcodes. In T. sinensis, customized hypervariable regions derived from whole plastome analysis, particularly those influenced by replication slippage or relaxed selection, demonstrate greater potential for high-resolution variety discrimination than traditional markers.

4.4. The Utility of Whole Chloroplast Genomes in T. sinensis Variety Discrimination

The complete chloroplast genome has become an increasingly valuable tool for plant variety identification [5], offering high-resolution discrimination through comprehensive genetic information, including SNPs, InDels, repeat variations, and localized divergence. In this study, phylogenetic analysis based on whole plastome sequences provided the highest resolving power among all datasets, successfully distinguishing major T. sinensis accessions such as HB, LW, and JZ with strong bootstrap support.
Compared with conventional barcodes and partial hypervariable regions, whole plastome data provide broader genomic context and enable the development of diverse marker types. Owing to its typically maternal inheritance and general lack of recombination, the chloroplast genome offers a stable genetic backbone for inferring varietal relationships. However, chloroplast genomes tend to evolve more slowly than nuclear genomes, and in species with low intraspecific plastome variability like T. sinensis, the amount of detectable polymorphism may still be limited. Moreover, whole-genome sequencing requires greater technical expertise and data-processing resources. Despite these challenges, full plastome-based approaches have demonstrated high discriminative power in a variety of crops and ornamentals. For instance, plastome phylogenies successfully differentiated cultivars in C. morifolium, S. baicalensis, and Chenopodium album [11,46,55]. These studies illustrate that analyzing even modest plastid variation across the entire genome can yield sufficient resolution to separate closely related varieties, particularly in clonally propagated or morphologically similar groups. Therefore, although the plastid genome of T. sinensis exhibits relatively low variation, whole chloroplast genome analysis still provides the most comprehensive and effective framework for variety identification. It enables precise phylogenetic reconstruction and offers a rich source of intraspecific genetic features that can support reliable variety-level discrimination.

4.5. Limitations, Future Perspectives, and Applications

While this study provides a chloroplast genomic basis for T. sinensis variety identification, the germplasm sampled so far is limited to northern China. Future work should incorporate samples from a wider geographical range to validate marker universality and test marker robustness in applied, mixed-sample contexts.
The variety-specific SSRs and hypervariable regions identified here are ideal targets for developing practical PCR-based assays. Such tools would directly support germplasm authentication, product verification, and breeding programs. Further integration of chloroplast and nuclear genomic data could offer deeper insights into varietal relationships.
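The readout of such an assay, after sequencing an amplicon, is essentially a repeat copy number at a known motif. A minimal Python sketch of that counting step is given here under stated assumptions: the function name, the flanking bases, and the two toy amplicons are hypothetical, and only the (TTAGGA)n motif and the two-versus-three copy contrast are taken from the Sanger validation described in this article.

```python
import re


def max_repeat_units(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Hypothetical amplicons: invented flanks around two or three motif copies.
MOTIF = "TTAGGA"
toy_amplicons = {
    "two_copy_example": "GATC" + MOTIF * 2 + "CCAT",
    "three_copy_example": "GATC" + MOTIF * 3 + "CCAT",
}

# Map each toy amplicon to its repeat copy number.
copy_numbers = {
    name: max_repeat_units(seq, MOTIF) for name, seq in toy_amplicons.items()
}
```

A practical assay would also need to check the flanking sequence and read quality before trusting the count; this sketch only covers the motif counting itself.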
Ultimately, this work not only advances molecular systematics but also delivers applicable resources for the sustainable management and utilization of T. sinensis.

5. Conclusions

This study provides a comparative analysis of 15 chloroplast genomes from five T. sinensis varieties collected across northern China. While their genomes share a highly conserved quadripartite structure and gene order, we identified subtle but informative variations, including eight variety-specific SSRs, two JZ-specific tandem repeats, and three hypervariable regions with elevated nucleotide diversity. Phylogenetic analysis also showed that complete chloroplast genomes offered the highest resolution for distinguishing varieties, outperforming conventional barcodes and hypervariable regions. These results collectively achieve our primary research objective of establishing a reliable, chloroplast genome-based framework for T. sinensis variety identification, and they demonstrate that, despite being generally conserved, the T. sinensis plastome contains sufficient variation for effective variety identification. The immediate benefit of this research is the provision of specific, validated molecular markers and genomic regions that can be directly used for germplasm authentication, breeding support, and product verification. The approach used in this study, including genome assembly, repeat analysis, and multi-scale phylogenetics, therefore provides a practical framework for future chloroplast genome-based studies in T. sinensis and related species. We hope that this work will facilitate the development of cost-effective diagnostic kits and encourage the integration of chloroplast and nuclear genomic data to further resolve the complex varietal relationships within this economically important species.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/agronomy16010127/s1. Figure S1: Comparison of the plastome structure of 15 T. sinensis using Geneious Prime; Figure S2: Codon usage bias and RNA editing sites in T. sinensis chloroplast genomes; Figure S3: Numbers of different repeat units of simple sequence repeats (SSRs) among 15 T. sinensis; Figure S4: Analysis of IR region boundaries in the plastomes of 15 T. sinensis; Figure S5: Comparison of the plastomes of 15 T. sinensis in Proksee; Figure S6: Sequence identity plots of the 15 T. sinensis plastomes; Figure S7: Agarose gel electrophoresis validating the amplification of target regions using designed primers; Figure S8: Sanger sequencing validation of the SSR marker (TTAGGA)n in 15 T. sinensis samples; Figure S9: Sanger sequencing validation of the SSR marker (TCCTAA)n in 15 T. sinensis samples; Figure S10: Sanger sequencing validation of the TR marker (TAAATTCTTTATTCAATTATAAAT)n in 15 T. sinensis samples; Figure S11: Sanger sequencing validation of the TR marker (AATATAGAATAGGAA)n in 15 T. sinensis samples; Table S1: Information on the collection and plastome assembly of 15 T. sinensis; Table S2: Gene annotation information for the plastomes of 15 T. sinensis; Table S3: Number of amino acids in the plastomes of 15 T. sinensis; Table S4: Specific SSRs in the plastomes of 15 T. sinensis; Table S5: Specific tandem repeats in the plastomes of 15 T. sinensis; Table S6: Primers for the identification of different varieties of T. sinensis; Table S7: Low-frequency variants detected at candidate chloroplast marker regions using LoFreq.

Author Contributions

M.W. and R.L. designed the study. S.Z. and P.D. analyzed the data. M.W., R.L., S.Z. and H.L. drafted the manuscript. M.W. and R.L. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Sichuan Provincial Key Laboratory for Development and Utilization of Characteristic Horticultural Biological Resources (grant no. 2023TSYY-03), Chengdu Normal University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw sequencing reads are available in the NCBI Sequence Read Archive under BioProject PRJNA1378337 (accessions SRR36386315–SRR36386329). The chloroplast genomes and annotations generated in this study are available on FigShare (https://doi.org/10.6084/m9.figshare.29194625.v1, accessed on 30 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SSR: Simple sequence repeats
RSCU: Relative synonymous codon usage
PCGs: Protein-coding genes
IR: Inverted repeat
SSC: Small single-copy
LSC: Large single-copy
ML: Maximum likelihood
Pi: Nucleotide diversity
SVs: Structural variations

References

  1. Zhao, Q.; Zhong, X.L.; Zhu, S.H.; Wang, K.; Tan, G.F.; Meng, P.H.; Zhang, J. Research advances in Toona sinensis, a traditional Chinese medicinal plant and popular vegetable in China. Diversity 2022, 14, 572.
  2. Dobrogojski, J.; Adamiec, M.; Luciński, R. The chloroplast genome: A review. Acta Physiol. Plant. 2020, 42, 98.
  3. Li, X.; Yang, Y.; Henry, R.J.; Rossetto, M.; Wang, Y.; Chen, S. Plant DNA barcoding: From gene to genome. Biol. Rev. 2015, 90, 157–166.
  4. Nock, C.J.; Waters, D.L.; Edwards, M.A.; Bowen, S.G.; Rice, N.; Cordeiro, G.M.; Henry, R.J. Chloroplast genome sequences from total DNA for plant identification. Plant Biotechnol. J. 2010, 9, 328–333.
  5. Wang, M.; Lin, H.; Lin, H.; Du, P.; Zhang, S. From species to varieties: How modern sequencing technologies are shaping medicinal plant identification. Genes 2024, 16, 16.
  6. Xiao, Y.; Wang, X.; He, Z.H.; Lv, Y.W.; Zhang, C.H.; Hu, X.S. Assessing the phylogenetic relationship among varieties of Toona ciliata (Meliaceae) in sympatry with chloroplast genomes. Ecol. Evol. 2023, 13, e10828.
  7. Yuan, L.; Ni, Y.; Chen, H.; Li, J.; Lu, Q.; Wang, L.; Zhang, X.; Yue, J.; Yang, H.; Liu, C. Comparative chloroplast genomes study of five officinal Ardisia Species: Unraveling interspecific diversity and evolutionary insights in Ardisia. Gene 2024, 912, 148349.
  8. Zhang, L.; Song, Y.; Li, J.; Liu, J.; Zhang, Z.; Xu, Y.; Fan, D.; Liu, M.; Ren, Y.; He, J.; et al. Identification, comparative and phylogenetic analysis of eight Vitis species based on the chloroplast genome revealed their contribution to heat tolerance in grapevines. Sci. Hortic. 2024, 327, 112833.
  9. Liu, B.; Zhang, J.; Shi, Y. Complete chloroplast genome of Toona sinensis (Meliaceae), a goluptious ‘tree vegetables’. Mitochondrial DNA B Resour. 2019, 4, 3025–3026.
  10. Li, Y.; Gu, M.; Lin, J.; Jiang, H.; Xiao, X.; Zhou, W. Comparative analysis of the complete chloroplast genomes in Toona sinensis and Toona ciliata: Phylogenetic relationship of Toona. Res. Sq. 2022, preprint.
  11. Duan, Y.; Wang, Y.; Ding, W.; Wang, C.; Meng, L.; Meng, J.; Chen, N.; Liu, Y.; Xing, S. Comparative and phylogenetic analysis of the chloroplast genomes of four commonly used medicinal cultivars of Chrysanthemums morifolium. BMC Plant Biol. 2024, 24, 992.
  12. Guo, Q.; Xue, X.; Wang, D.; Zhang, L.; Liu, W.; Wang, E.; Cui, X.; Hou, X. Genetic diversity and population genetic structure of Paeonia suffruticosa by chloroplast DNA simple sequence repeats (cpSSRs). Hortic Plant J. 2024, 11, 367–376.
  13. Ping, J.; Feng, P.; Li, J.; Zhang, R.; Su, Y.; Wang, T. Molecular evolution and SSRs analysis based on the chloroplast genome of Callitropsis funebris. Ecol. Evol. 2021, 11, 4786–4802.
  14. Li, D.M.; Pan, Y.G.; Wu, X.Y.; Zou, S.P.; Wang, L.; Zhu, G.F. Comparative chloroplast genomics, phylogenetic relationships and molecular markers development of Aglaonema commutatum and seven green cultivars of Aglaonema. Sci. Rep. 2024, 14, 11820.
  15. Li, W.; Liu, Y.; Yang, Y.; Xie, X.; Lu, Y.; Yang, Z.; Jin, X.; Dong, W.; Suo, Z. Interspecific chloroplast genome sequence diversity and genomic resources in Diospyros. BMC Plant Biol. 2018, 18, 210.
  16. Li, H.; Xiao, W.; Tong, T.; Li, Y.; Zhang, M.; Lin, X.; Zou, X.; Wu, Q.; Guo, X. The specific DNA barcodes based on chloroplast genes for species identification of Orchidaceae plants. Sci. Rep. 2021, 11, 1424.
  17. Yu, J.; Fu, J.; Fang, Y.; Xiang, J.; Dong, H. Complete chloroplast genomes of Rubus species (Rosaceae) and comparative analysis within the genus. BMC Genom. 2022, 23, 32.
  18. Zhang, Y.M.; Han, L.J.; Yang, C.W.; Yin, Z.L.; Tian, X.; Qian, Z.G.; Li, G.D. Comparative chloroplast genome analysis of medicinally important Veratrum (Melanthiaceae) in China: Insights into genomic characterization and phylogenetic relationships. Plant Divers. 2022, 44, 70–82.
  19. Doyle, J.J.; Doyle, J.L. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 1987, 19, 11–15.
  20. Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2023, 2, e107.
  21. Bioinformatics, B. FastQC: A Quality Control Tool for High Throughput Sequence Data; Babraham Institute Cambridge: Cambridge, UK, 2011; Available online: http://www.bioinformatics.babraham.ac.uk/projects/fastqc (accessed on 12 January 2025).
  22. Dierckxsens, N.; Mardulyn, P.; Smits, G. NOVOPlasty: De novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017, 45, e18.
  23. Huang, D.I.; Cronk, Q.C. Plann: A command-line application for annotating plastome sequences. Appl. Plant Sci. 2015, 3, 1500026.
  24. Greiner, S.; Lehwark, P.; Bock, R. OrganellarGenomeDRAW (OGDRAW) version 1.3.1: Expanded toolkit for the graphical visualization of organellar genomes. Nucleic Acids Res. 2019, 47, 59–64.
  25. Shields, D.C.; Sharp, P.M. Synonymous codon usage in Bacillus subtilis reflects both translational selection and mutational biases. Nucleic Acids Res. 1987, 15, 8023–8040.
  26. Lenz, H.; Hein, A.; Knoop, V. Plant organelle RNA editing and its specificity factors: Enhancements of analyses and new database features in PREPACT 3.0. BMC Bioinform. 2018, 19, 255.
  27. Kalia, R.K.; Rai, M.K.; Kalia, S.; Singh, R.; Dhawan, A.K. Microsatellite markers: An overview of the recent progress in plants. Euphytica 2011, 177, 309–334.
  28. Beier, S.; Thiel, T.; Münch, T.; Scholz, U.; Mascher, M. MISA-web: A web server for microsatellite prediction. Bioinformatics 2017, 33, 2583–2585.
  29. Kurtz, S.; Choudhuri, J.V.; Ohlebusch, E.; Schleiermacher, C.; Stoye, J.; Giegerich, R. REPuter: The manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001, 29, 4633–4642.
  30. Benson, G. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 1999, 27, 573–580.
  31. Zhu, A.; Guo, W.; Gupta, S.; Fan, W.; Mower, J.P. Evolutionary dynamics of the plastid inverted repeat: The effects of expansion, contraction, and loss on substitution rates. New Phytol. 2016, 209, 1747–1756.
  32. Li, H.; Guo, Q.; Xu, L.; Gao, H.; Liu, L.; Zhou, X. CPJSdraw: Analysis and visualization of junction sites of chloroplast genomes. PeerJ 2023, 11, e15326.
  33. Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S.; Buxton, S.; Cooper, A.; Markowitz, S.; Duran, C.; et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 2012, 28, 1647–1649.
  34. Grant, J.R.; Enns, E.; Marinier, E.; Mandal, A.; Herman, E.K.; Chen, C.Y.; Graham, M.; Van Domselaar, G.; Stothard, P. Proksee: In-depth characterization and visualization of bacterial genomes. Nucleic Acids Res. 2023, 51, 484–492.
  35. Frazer, K.A.; Pachter, L.; Poliakov, A.; Rubin, E.M.; Dubchak, I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2004, 32, 273–279.
  36. Librado, P.; Rozas, J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics 2009, 25, 1451–1452.
  37. Katoh, K.; Rozewicki, J.; Yamada, K.D. MAFFT online service: Multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 2018, 20, 1160–1166.
  38. Stamatakis, A. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 2014, 30, 1312–1313.
  39. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013, arXiv:1303.3997.
  40. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R. The sequence alignment/map format and SAMtools. Bioinformatics 2009, 25, 2078–2079.
  41. Wilm, A.; Aw, P.P.; Bertrand, D.; Yeo, G.H.; Ong, S.H.; Wong, C.H.; Khor, C.C.; Petric, R.; Hibberd, M.L.; Nagarajan, N. LoFreq: A sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 2012, 40, 11189–11201.
  42. Daniell, H.; Lin, C.S.; Yu, M.; Chang, W.J. Chloroplast genomes: Diversity, evolution, and applications in genetic engineering. Genome Biol. 2016, 17, 134.
  43. Wang, G.; Ren, Y.; Su, Y.; Zhang, H.; Li, J.; Han, J. Molecular marker development and phylogenetic analysis of Aconitum species based on chloroplast genomes. Ind. Crops Prod. 2024, 221, 119386.
  44. Xue, H.; Xing, Y.; Bian, C.; Hou, W.; Men, W.; Zheng, H.; Yang, Y.; Ying, X.; Kang, T.; Xu, L. Comparative analysis of chloroplast genomes of Pulsatilla species reveals evolutionary and taxonomic status of newly discovered endangered species Pulsatilla saxatilis. BMC Plant Biol. 2024, 24, 293.
  45. Miao, H.; Bao, J.; Li, X.; Ding, Z.; Tian, X. Comparative analyses of chloroplast genomes in ‘Red Fuji’ apples: Low rate of chloroplast genome mutations. PeerJ 2022, 10, e12927.
  46. Li, Z.; Duan, B.; Zhou, Z.; Fang, H.; Yang, M.; Xia, C.; Zhou, Y.; Wang, J. Comparative analysis of medicinal plants Scutellaria baicalensis and common adulterants based on chloroplast genome sequencing. BMC Genom. 2024, 25, 39.
  47. Li, Y.C.; Korol, A.B.; Fahima, T.; Beiles, A.; Nevo, E. Microsatellites: Genomic distribution, putative functions and mutational mechanisms: A review. Mol. Ecol. 2002, 11, 2453–2465.
  48. Kapustová, V.; Tulpová, Z.; Toegelová, H.; Novák, P.; Macas, J.; Karafiátová, M.; Hřibová, E.; Doležel, J.; Šimková, H. The dark matter of large cereal genomes: Long tandem repeats. Int. J. Mol. Sci. 2019, 20, 2483.
  49. Kumar, P.; Gupta, V.K.; Misra, A.K.; Modi, D.R.; Pandey, B.K. Potential of molecular markers in plant biotechnology. Plant Omics 2009, 2, 141–162.
  50. Hollingsworth, P.M.; Graham, S.W.; Little, D.P. Choosing and using a plant DNA barcode. PLoS ONE 2011, 6, e19254.
  51. Hollingsworth, P.M.; Forrest, L.L.; Spouge, J.L.; Hajibabaei, M.; Ratnasingham, S.; van der Bank, M.; Chase, M.W.; Cowan, R.S.; Erickson, D.L.; Fazekas, A.J.; et al. A DNA barcode for land plants. Proc. Natl. Acad. Sci. USA 2009, 106, 12794–12797.
  52. Mehmood, F.; Shahzadi, I.; Waseem, S.; Mirza, B.; Ahmed, I.; Waheed, M.T. Chloroplast genome of Hibiscus rosa-sinensis (Malvaceae): Comparative analyses and identification of mutational hotspots. Genomics 2020, 112, 581–591.
  53. Ramesh, G.A.; Mathew, D.; John, K.J.; Ravisankar, V. Chloroplast gene matK holds the barcodes for identification of Momordica (Cucurbitaceae) species from Indian subcontinent. Hortic Plant J. 2022, 8, 89–98.
  54. Abouseada, H.H.; Mohamed, A.S.; Teleb, S.S.; Badr, A.; Tantawy, M.E.; Ibrahim, S.D.; Ellmouni, F.Y.; Ibrahim, M. Genetic diversity analysis in wheat cultivars using SCoT and ISSR markers, chloroplast DNA barcoding and grain SEM. BMC Plant Biol. 2023, 23, 193.
  55. Park, J.; Min, J.; Kim, Y.; Chung, Y. The comparative analyses of six complete chloroplast genomes of morphologically diverse Chenopodium album L. (Amaranthaceae) collected in Korea. Int. J. Genom. 2021, 2021, 6643444.
Figure 1. Physical map of the Toona sinensis chloroplast genomes sequenced in this study. Genes are color-coded by functional category. The inner circle delineates large single-copy (LSC), small single-copy (SSC), and inverted repeat (IR) region boundaries, with GC content represented by dark gray shading.
Figure 2. Repeat element analysis in 15 Toona sinensis chloroplast genomes. (A) simple sequence repeat (SSR) motif counts. (B) Shared/unique SSR distribution. (C) Long repeat type frequencies. (D) Shared/unique long repeats. (E) Tandem repeat classification. (F) Shared/unique tandem repeats.
Figure 3. Nucleotide diversity across Toona sinensis chloroplast genomes. The sliding window analysis reveals hypervariable (peaks) and conserved (troughs) regions along the chloroplast genome.
Figure 4. Maximum likelihood phylogenetic trees of Toona sinensis accessions based on conventional chloroplast markers. (A) matK gene tree. (B) rbcL gene tree. (C) trnH-psbA intergenic spacer tree. (D) Concatenated analysis of all three markers. Bootstrap support values are indicated by different node symbols. The five-pointed stars, hexagons and triangles represent high support of 81–100%, moderate support of 61–80% and low support of <60%, respectively.
Figure 5. Maximum likelihood phylogenetic trees of Toona sinensis accessions based on hypervariable regions. (A) ycf1 gene tree. (B) ndhF gene tree. (C) trnT-TGT–trnF-GAA intergenic spacer tree. (D) Concatenated analysis of all three markers. Bootstrap support values are indicated by different node symbols. The five-pointed stars, hexagons and triangles represent high support of 81–100%, moderate support of 61–80% and low support of <60%, respectively.
Figure 6. Maximum likelihood phylogenetic trees of Toona sinensis accessions based on entire chloroplast genomes. Bootstrap support values are indicated by different node symbols. The five-pointed stars, hexagons and triangles represent high support of 81–100%, moderate support of 61–80% and low support of <60%, respectively.
Figure 7. Sanger sequencing validation of representative molecular markers in T. sinensis. (A,B) Two SSR loci, (TTAGGA)n and (TCCTAA)n, showing repeat copy number variation among varieties. The LW variety consistently exhibits three repeat units, whereas other varieties show two repeat units. (C,D) Two tandem repeat loci showing variety-specific patterns. Both tandem repeats are present with two repeat units exclusively in the JZ variety and are absent in other varieties.
Table 1. Sequence characteristics of the 15 Toona sinensis chloroplast genomes analyzed in this study.
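| Accession Number | Total Size (bp) | LSC Length (bp) | SSC Length (bp) | IR Length (bp) | Overall GC Content (%) | LSC GC Content (%) | SSC GC Content (%) | IR GC Content (%) |
|---|---|---|---|---|---|---|---|---|
| JZ1 | 159,278 | 86,970 | 18,346 | 26,981 | 37.88 | 36.04 | 32.22 | 42.79 |
| JZ2 | 159,278 | 86,922 | 18,346 | 27,005 | 37.88 | 36.03 | 32.22 | 42.79 |
| JZ3 | 159,300 | 86,922 | 18,346 | 27,016 | 37.88 | 36.03 | 32.22 | 42.79 |
| LQ1 | 159,286 | 86,916 | 18,334 | 27,018 | 37.89 | 36.04 | 32.24 | 42.78 |
| LQ2 | 159,286 | 86,978 | 18,334 | 26,987 | 37.89 | 36.05 | 32.24 | 42.78 |
| LQ3 | 159,286 | 86,914 | 18,334 | 27,019 | 37.89 | 36.04 | 32.24 | 42.78 |
| LW1 | 159,252 | 86,894 | 18,334 | 27,012 | 37.89 | 36.05 | 32.22 | 42.78 |
| LW2 | 159,252 | 86,898 | 18,334 | 27,010 | 37.89 | 36.05 | 32.22 | 42.78 |
| LW3 | 159,252 | 86,890 | 18,334 | 27,014 | 37.89 | 36.05 | 32.22 | 42.78 |
| QZ1 | 159,290 | 86,978 | 18,334 | 26,989 | 37.89 | 36.04 | 32.24 | 42.78 |
| QZ2 | 159,311 | 87,007 | 18,334 | 26,989 | 37.89 | 36.04 | 32.24 | 42.78 |
| QZ3 | 159,286 | 86,962 | 18,334 | 26,995 | 37.89 | 36.04 | 32.24 | 42.78 |
| HB1 | 159,281 | 86,947 | 18,332 | 27,001 | 37.89 | 36.04 | 32.24 | 42.78 |
| HB2 | 159,281 | 86,985 | 18,332 | 26,982 | 37.89 | 36.05 | 32.24 | 42.78 |
| HB3 | 159,281 | 86,911 | 18,332 | 27,019 | 37.89 | 36.04 | 32.24 | 42.78 |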
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, S.; Du, P.; Lin, H.; Wang, M.; Li, R. Chloroplast Genome-Based Insights into Variety Identification in Toona sinensis. Agronomy 2026, 16, 127. https://doi.org/10.3390/agronomy16010127