DNA-Barcoding for Cultivar Identification and Intraspecific Diversity Analysis of Agricultural Crops

Samarina, Lidiia S.; Koninskaya, Natalia G.; Shkhalakhova, Ruset M.; Simonyan, Taisiya A.; Kuzmina, Daria O.

doi:10.3390/ijms26146808

Open Access Review


by Lidiia S. Samarina *, Natalia G. Koninskaya, Ruset M. Shkhalakhova, Taisiya A. Simonyan and Daria O. Kuzmina

Federal Research Centre the Subtropical Scientific Centre of the Russian Academy of Sciences, 354002 Sochi, Russia

* Author to whom correspondence should be addressed.

Int. J. Mol. Sci. 2025, 26(14), 6808; https://doi.org/10.3390/ijms26146808

Submission received: 5 June 2025 / Revised: 10 July 2025 / Accepted: 14 July 2025 / Published: 16 July 2025

(This article belongs to the Special Issue Plant Biology and Biotechnology: Focus on Genomics, Bioinformatics and AI)


Abstract

DNA barcoding of the intraspecific diversity of agricultural crops is important for developing genetic passports of valuable genotypes and cultivars. The advantage of DNA barcoding over traditional genotyping of cultivars is that the procedure can be unified and applied to a broad range of accessions. This not only makes it cost-efficient but also allows the development of open-access genetic databases that accumulate information on the world’s germplasm collections of different crops. In this regard, the aim of this review was to analyze the latest research in this field, including the selection of loci, universal primers, strategies of amplicon analysis, bioinformatic tools, and the development of databases. We reviewed the advantages and disadvantages of each strategy with a focus on cultivar identification. The data indicate that the following chloroplast loci are the most prominent for intraspecific diversity analysis: trnE-UUC/trnT-GUU, rpl23/rpl2.l, psbA-trnH, trnL-trnF, trnK, rpoC1, ycf1-a, rpl32-trnL, trnH-psbA, and matK. We suggest that a combination of three or four of these loci can be a sufficient DNA barcode for cultivar-level identification. This combination has to be selected for each crop. The advantages and disadvantages of different approaches to amplicon analysis are discussed, and the bioinformatic tools and databases for plant barcoding are reviewed. This review will be useful for selecting appropriate strategies for barcoding the intraspecific diversity of agricultural crops in order to develop genetic passports of valuable cultivars in germplasm collections worldwide.

Keywords:

cultivar identification; barcoding libraries; chloroplast DNA; mitochondrial DNA; intraspecific diversity; genetic passport; next generation sequencing; bioinformatic tools

1. Introduction

DNA barcoding, also known as genetic barcoding, is a molecular identification method that allows us to distinguish taxa using several standardized regions of DNA [1]. This method is used to characterize existing biodiversity and to identify new species and genotypes [2]. The development of DNA barcodes for valuable cultivars in local germplasm collections is important for tracking these cultivars and analyzing their genetic relationships. In addition, DNA barcoding is an important tool for studying the evolution, ecology, and conservation of plants, especially given that biodiversity is threatened by anthropogenic activities and unfavorable climatic factors [3,4,5].

The use of DNA barcoding for the identification and conservation of valuable varieties has attracted considerable attention in recent years, especially due to the strengthening of phytosanitary measures and biodiversity management. Most of the previously published reviews on the topic of DNA barcoding are devoted to plant identification at the species level [5,6,7,8,9,10,11]. To address the problem of cultivar identification, it is necessary to review recently published DNA barcoding studies that were efficient at the genotype level. In this regard, the aim of the current review was to analyze the latest research in this field, including the selection of loci, primers, strategies of amplicons analysis, bioinformatic tools, and databases development.

The advantage of DNA barcoding over traditional genotyping of cultivars is that the procedure can be unified and applied to a broad range of accessions. This not only makes it cost-efficient but also allows the development of open-access genetic databases that accumulate information on the world’s germplasm collections of different crops. Cultivar barcoding is important for the development of genetic passports for the molecular identification of cultivars of different agricultural crops. Genetic passports will help protect the copyrights of breeders, protect farmers from buying counterfeit products, and provide reliable and complete information about a particular genotype. Due to the increasing number of cases of illegal commercialization of selected cultivars, the protection of plant breeders’ rights has become of paramount importance for breeding companies [12]. For example, single-nucleotide variants were identified among 15 lavender varieties and allowed for the assessment of the genetic identities of individuals to protect Mediterranean breeding lines and confirm their origin [12].

The development of genetic passports of valuable genotypes depends on the selection of reliable strategies for DNA barcoding, including the correct choice of the target loci, suitable primers for the particular crop, an amplicon analysis strategy, bioinformatic tools, and the development of databases of plant ex situ collections (Figure 1). Below, we discuss each of these steps to help select an efficient methodology that provides accurate, time- and labor-saving protocols for characterizing the intraspecific diversity of valuable crops. This review will be of interest to researchers who deal with germplasm collections, characterization, and breeding research.

2. Selection of Appropriate Loci

The selection of loci for barcoding should follow the criteria of universality and high discriminatory power. Universality means that the locus is efficient for a wide range of species and genera, while discriminatory power is the ability to identify closely related genotypes. Methods based on mitochondrial DNA (mtDNA), chloroplast DNA (cpDNA), and nuclear DNA are used for barcoding land plants. Below, we discuss the advantages and disadvantages of each and give some examples of their use for cultivar identification.

2.1. Mitochondrial DNA for Cultivars Identification

Mitochondrial DNA has certain advantages, such as maternal inheritance and the potential for whole-genome resequencing. However, its application in plant cultivar barcoding is limited by low variability and structural conservatism, which reduce its discriminatory power at the cultivar level [13,14]. Consequently, the usefulness of mitochondrial DNA for plant cultivar identification remains controversial, with some studies pointing to limited resolution at the cultivar level. Moreover, a recent study reported that 35% of the ancestral plastid genomes were transferred to mitochondrial genomes over the past 10 million years. The authors reported that some plastid barcoding markers co-amplified mitochondrial DNA, which caused mis-authentication of plants [15].

Despite this, a few recent studies show the efficiency of mtDNA as an alternative or complementary locus for cultivar-level identification. In particular, the cox2 intron II region of mtDNA was efficiently used for the characterization of the intraspecific diversity of Panax ginseng [16]. Additionally, whole mitochondrial genome analysis allowed for the identification of 156 unique mitochondrial SNPs in a collection of 171 Tunisian date palm cultivars [14]. We have not found any other examples that used mtDNA for cultivar identification. Nevertheless, the analysis of complete mitochondrial genomes may be efficient for revealing polymorphisms at low taxonomic levels.

2.2. Chloroplast DNA for Cultivars Identification

The chloroplast genome, a closed circular DNA molecule present in multiple copies in plant cells [17], exhibits a high copy number, making it readily accessible for analysis, even from degraded samples [18]. The cpDNA contains a combination of conserved and variable regions useful for both phylogenetic studies and barcoding, allowing for the differentiation of closely related cultivars [14,19]. In 2009, a large group of plant DNA barcoding specialists proposed chloroplast genes rbcL and matK as the universal barcode. However, this universal barcode was often insufficient for intraspecific diversity analysis and had been met with great skepticism since its proposal [20]. Later, several other loci of cpDNA were proposed for plant DNA barcoding [21,22,23,24,25,26]. In 2011, the polymorphism of several plastid loci (rpl23&rpl2.1, 16S, 23S, 4.5S&5S, petB&D and rpl2, rpoC1 and trnK introns) was checked in 96 different plant species, and it was found that the trnK and rpoC1 introns are the most variable loci in closely related species, showing their high potential for barcoding [23]. Our recent results on four subtropical crops (actinidia, feijoa, mandarins, and tea) confirmed the efficiency of several chloroplast loci (23S,4.5S/5S, 16S, rpl23/rpl2.l, rpoC1 intron, trnE-UUC/trnT-GUU) for the cultivars’ barcoding. Among them, rpl23/rpl2.l and trnE-UUC/trnT-GUU showed the highest intraspecific polymorphisms for all crops, while rpl2 intron and 16S displayed the lowest polymorphism levels in the generated fragments [27].

The intergenic regions of the cpDNA are known for their greater polymorphism as compared to genes. According to Dong et al. [28], analysis of chloroplast genomes in 12 angiosperm genera revealed the top 5% most variable loci. These comprised 23 loci, of which the most variable were, in order from highest to lowest variability, the intergenic regions ycf1-a, trnK, rpl32-trnL, and trnH-psbA, followed by trnS(UGA)-trnG(UCC), petA-psbJ, rps16-trnQ, ndhC-trnV, ycf1-b, ndhF, rpoB-trnC, psbE-petL, and rbcL-accD. Three loci, trnS(UGA)-trnG(UCC), trnT-psbD, and trnW-psaJ, showed very high nucleotide diversity per site across three genera [28]. Another study reported that the trnH-psbA barcode region showed greater resolving power compared to rbcL and matK in several plant species [29]. However, some intergenic regions are not always efficient enough for cultivar identification. In particular, the trnH-psbA and trnL-trnF intergenic spacers were able to distinguish and identify only four among 25 Prunus accessions [30]. Another study showed that trnH-psbA is not suitable for some taxa due to intraspecific inversions and rps19 insertions, which inflate intraspecific variation [31]. Likewise, trnH-psbA was efficient for the discrimination of Physalis species, but intraspecific polymorphism was low [32]. In addition, most of the intergenic regions were not efficient for Triticum barcoding. However, a combination of the intergenic region trnfM-trnT with either trnD-psbM, petN-trnC, matK-rps16, or rbcL-psaI demonstrated a very high discrimination capacity [33]. These results confirm the necessity of crop-specific selection of loci for cultivar barcoding. To summarize, many researchers have effectively used cpDNA in cultivar identification of various plant species, including Vitis [19], Medicago sativa [34], Scutellaria baicalensis [35], Chrysanthemum [36], Phoenix [13], Orchidaceae [37], and Ficus [38]. These examples demonstrate the broad applicability of cpDNA barcoding across various plant families and cultivars.

Although cpDNA barcoding offers significant advantages, it is important to consider the limitations of this approach. The relatively slow mutation rate in certain cpDNA regions may limit the resolution for discrimination between closely related cultivars. Therefore, a combination of cpDNA loci with other genetic markers is necessary for optimal cultivar discrimination [34]. There are still challenges in standardizing cpDNA markers and establishing unified protocols for barcoding different plant species in different laboratories, highlighting the need for further improvement of these methodologies.

2.3. Nuclear DNA for Cultivars Identification

Among the various DNA regions for plant barcoding, the Internal Transcribed Spacer (ITS) has confirmed its efficiency for many species and was proposed as universal barcode [39,40,41]. The ITS region is a non-coding region located in the ribosomal DNA repeat unit of the nuclear genome [40]. It is flanked by the highly conserved 18S, 5.8S, and 28S ribosomal RNA genes. The ITS region consists of two segments, ITS1 and ITS2, separated by the 5.8S gene. These spacer regions are transcribed but removed during the processing of ribosomal RNA. The ITS region offers several advantages for DNA barcoding applications: (1) high sequence variability, making it suitable for distinguishing closely related species and cultivars [39,40,41,42]; (2) availability of universal primers transferable to a broad range of plant taxa [43]; (3) extensive databases with a large number of ITS sequences [43]; and (4) high success rates in identifying plant species and cultivars using the ITS region [39,44].

Several recent studies have explored the use of ITS barcoding for cultivar discrimination. High discrimination power of ITS sequences was demonstrated in pineapple [44], cassava [41], banana [43], apricot [45], chili pepper [46], and mango [47]. However, studies on pomegranate [48], fig [49], tea plant, and citrus [26] have found the ITS region to be ineffective in distinguishing among cultivars. The limitations of ITS barcoding are as follows: (1) copy number variation: ITS copy number variation can lead to inaccurate phylogenetic inferences; (2) PCR bias: PCR amplification can be biased towards certain ITS variants; (3) hybridization and polyploidy: ITS sequences may not be reliable for identifying hybrids or polyploid species; (4) reference library needs: Kjærandsen [50] highlights the need to build high-quality reference libraries; and (5) resource-consuming data analysis: sequencing reads must be analyzed carefully to avoid errors and noise, which is time-consuming. These limitations make the unification of protocols in large-scale experiments difficult. Thus, as compared to the ITS region, cpDNA loci can be a more efficient barcode due to the simplicity of amplification, sequencing, and data analysis, as well as the greater transferability of primers among different genera.

2.4. Multilocus DNA Barcoding Systems for Cultivars Identification

The problem of creating a universal barcode for land plants remains unresolved, since no single locus or combination has been generally accepted. Some researchers have concluded that single-locus barcodes often do not have sufficient resolution to distinguish closely related cultivars or species. In this regard, the use of multilocus DNA barcoding is becoming a popular approach for the accurate identification and discrimination of plant cultivars. For example, the need for a multilocus approach has been highlighted for mango cultivars [51] and closely related clerodendron species [52]. In the latter study, a combination of the nuclear ITS region with several cpDNA loci (matK, rbcL, ycf1) was evaluated, and the best combination was ITS + matK. This combination was also efficient for grasses (Agropyron, Bromus, Elymus, Elytrigia, Festuca, Leymus, and Lolium) [53] and medicinal orchids [54,55]. However, for olive cultivars, matK was less variable than rbcL. Variability levels of 77% and 98.7% were observed for matK (fragment of 816 bp) and rbcL (fragment of 411 bp), respectively [56].

Other studies also proposed two-loci DNA barcodes. For example, the effectiveness of a combination of ITS2 and trnH-psbA was confirmed for cultivars discrimination of saffron [57], pomegranate [48], and forage plants [58]. Additionally, the high efficiency of the three-loci barcode (ITS2 + psbA-trnH + trnL-trnF), with an identification rate of 93.6%, was established for nine species of Syringa; intraspecific diversity was revealed by each of the loci [59].

Some researchers proposed combining DNA barcoding with traditional genotyping techniques [60,61] and with morphological and chemical data [10,62], leading to more robust variety identification systems. For example, Wang et al. [63] conducted a comprehensive study of Citri Reticulatae Pericarpium (CRP) using an integrated approach that combined LC-QTOF MS-based metabolomics and DNA barcoding. Their study successfully identified chemical markers that differentiated the Guangchenpi and Chenpi subtypes, demonstrating the effectiveness of combining chemical profiling with molecular methods for cultivar differentiation. Similarly, the development of nomenclatural standards and genotyping methods for potato cultivars [64] highlights the importance of combining traditional taxonomic approaches with modern molecular techniques such as cpDNA analysis.

To summarize, the choice of a strategy for cultivars identification fluctuates between two criteria: (1) labor/time intensity and (2) reliability of identification. Combining different methods is a more labor-intensive approach, limiting its application in the large-scale experiments and making the analysis of a large number of samples difficult. On the other hand, choosing only one or two loci has been associated with the risk of poor reliability of cultivar identification. Despite this, multilocus approaches may become an integral part of modern plant taxonomy and variety identification, ensuring accurate identification and protection of plant genetic resources [54,65,66].

3. Selection of Primers

Primer selection is a fundamental step for successful amplification. An important criterion that simplifies the barcoding procedure is the universality of the primers, i.e., their transferability to a large number of plant species and genera. A recent study evaluated the transferability of rbcL and matK primer sequences in silico using the R package “openPrimeR”. In total, 366 and 489 different primers were found for rbcL and matK, respectively. These primers were tested in 8463 species. When evaluating the primer-sequence correspondence, the highest sequence coverages were 96.39% (forward) and 93.81% (reverse) for rbcL and 91.56% (forward) and 61.62% (reverse) for matK. No universal primer was found for all land plants, but two pairs of rbcL primers were able to amplify >99% of the sequences. In contrast, for the matK region, the 10 pairs optimized for the highest sequence coverage did not cover >85% of the sequences [23].

In silico analysis of existing ITS primers based on highly representative datasets showed that primers universal to this region are suitable for more than 95% of plant species in most groups. A total of 335 samples from 219 angiosperm families, 11 gymnosperm families, 24 fern and lycophyte families, 16 moss families, and 17 fungi families were used to test the performance of these primers [67].
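The idea of in silico primer coverage can be sketched as follows. This is a simplified illustration, not the algorithm of openPrimeR or of any cited study: a primer is counted as binding when a window of the template matches it with at most a chosen number of mismatches, the reverse primer is searched as its reverse complement, and the relative order of the two sites is not checked. The primer and template sequences are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq.upper()))

def has_site(primer, template, max_mismatches=1):
    """True if any window of the template matches the primer within the mismatch limit."""
    primer, template = primer.upper(), template.upper()
    n = len(primer)
    for i in range(len(template) - n + 1):
        mismatches = sum(p != t for p, t in zip(primer, template[i:i + n]))
        if mismatches <= max_mismatches:
            return True
    return False

def coverage(forward, reverse, templates, max_mismatches=1):
    """Fraction of templates with a site for both primers of the pair."""
    reverse_site = revcomp(reverse)
    hits = sum(
        has_site(forward, t, max_mismatches) and has_site(reverse_site, t, max_mismatches)
        for t in templates
    )
    return hits / len(templates)

# Hypothetical primer pair and three toy templates
fwd, rev = "ACGTAC", "CCTTGA"
toy_templates = [
    "ACGTACGGGGTCAAGG",  # exact sites for both primers
    "ACGTACGGGGTCATGG",  # one mismatch in the reverse primer site
    "TTTTTTTTTTTTTTTT",  # no binding site
]
relaxed = coverage(fwd, rev, toy_templates, max_mismatches=1)
strict = coverage(fwd, rev, toy_templates, max_mismatches=0)
```

Real primer evaluation also considers degenerate bases, mismatch position relative to the 3' end, and melting temperature, none of which are modeled here.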

To check the transferability of commonly used primers for cpDNA and for the ITS region, we used NCBI Primer-BLAST (Table 1, Supplementary Table S1). The number of accessions for each primer pair varied from 57 (for ITS) to 915 (for trnE-UUC/trnT-GUU) (Figure 2 top). The greatest variability of the product size (dispersion of 3300–3400 bp) was observed for the ITS-targeted primer pairs. The lowest variability of the amplicon length was observed for rbcLa F/rbcLr 590, petB/petD, matK, and 16S (Figure 2 bottom). There appears to be no correlation between the size of the product and the number of accessions in the NCBI database.

To summarize, large-scale barcoding experiments need uniform protocols, including the application of universal primers that yield fragments of similar length from highly polymorphic loci. According to the literature data, the following cpDNA loci and primers are proposed as the most variable for intraspecific analysis: trnE-UUC/trnT-GUU, rpl23/rpl2.l, psbA-trnH, trnL-trnF, trnK, rpoC1, ycf1-a, rpl32-trnL, trnH-psbA, and matK. We suggest that a combination of three of these loci can be a sufficient DNA barcode for cultivar-level identification of agricultural crops (Figure 3). The best combination has to be selected for each crop. The addition of ITS loci can be efficient in some cases. However, it seems that ITS-targeted primers are not efficient for many species, as they show a high level of product size variability, which constrains the uniformity of the barcoding procedure. Among the universal chloroplast primers, trnE-UUC/trnT-GUU are transferable to the greatest number of species and show high polymorphism among closely related genotypes [26].
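The crop-specific choice of a locus combination can be illustrated with a small sketch. It assumes that each cultivar has already been assigned a haplotype label at every candidate locus and then searches all k-locus combinations for the one that leaves the largest fraction of cultivars with a unique combined profile. The function names, haplotype labels, and cultivar names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def resolved_fraction(haplotypes, loci):
    """Fraction of cultivars whose combined haplotype over the given loci is unique."""
    profiles = {cv: tuple(h[locus] for locus in loci) for cv, h in haplotypes.items()}
    counts = Counter(profiles.values())
    unique = sum(1 for p in profiles.values() if counts[p] == 1)
    return unique / len(profiles)

def best_combination(haplotypes, k=3):
    """Exhaustively pick the k-locus combination that resolves the most cultivars.

    Ties are broken by alphabetical order of the locus names.
    """
    loci = sorted(next(iter(haplotypes.values())))
    return max(combinations(loci, k), key=lambda c: resolved_fraction(haplotypes, c))

# Hypothetical haplotype labels for four cultivars at four loci
toy_haplotypes = {
    "cv1": {"matK": "A", "rpoC1": "A", "trnK": "A", "trnL-trnF": "A"},
    "cv2": {"matK": "A", "rpoC1": "B", "trnK": "A", "trnL-trnF": "A"},
    "cv3": {"matK": "A", "rpoC1": "A", "trnK": "B", "trnL-trnF": "A"},
    "cv4": {"matK": "A", "rpoC1": "B", "trnK": "B", "trnL-trnF": "A"},
}
best_pair = best_combination(toy_haplotypes, k=2)
```

With about ten candidate loci, an exhaustive search over triplets involves only 120 combinations, so no heuristic is needed at this scale.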

4. Strategies of Amplicon Analysis

4.1. Restriction Analysis

The restriction analysis of amplicons leverages the principles of restriction fragment length polymorphism (RFLP) to differentiate between genetic variants within a species, providing a reliable method for cultivar identification and classification. RFLP is a technique that involves the digestion of amplified DNA fragments with specific restriction enzymes that cut the DNA at known sequences. The resulting fragments are then separated by size using gel electrophoresis, allowing for the visualization of distinct patterns that can be used to differentiate between cultivars. For example, Wolff et al. [76] demonstrated that RFLP could effectively distinguish between chrysanthemum cultivars, highlighting the stability of DNA fingerprint patterns. Their study on soybean cultivars revealed that combining endonuclease cleavage with amplification could significantly improve the resolution of genetic markers, facilitating the identification of closely related cultivars. The integration of multiple endonuclease digestions has been shown to enhance the detection of polymorphic DNA [77].
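The principle of the restriction analysis described above can be sketched in silico. The example assumes a linear amplicon and a single enzyme, searches only the given strand (sufficient for palindromic recognition sites), and uses EcoRI (G^AATTC) for illustration. The two toy amplicons are hypothetical and differ by one substitution that abolishes the site.

```python
def digest(amplicon, site, cut_offset):
    """Fragment lengths (longest first) after cutting a linear amplicon.

    site       -- recognition sequence, e.g. "GAATTC" for EcoRI
    cut_offset -- cut position within the site on the given strand
                  (1 for EcoRI, which cuts G^AATTC)
    """
    amplicon, site = amplicon.upper(), site.upper()
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    bounds = [0] + cuts + [len(amplicon)]
    fragments = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(fragments, reverse=True)

# Two hypothetical cultivar amplicons: the second carries an A>G substitution in the site
cultivar_a = "AAAAGAATTCCCCCCCCC"
cultivar_b = "AAAAGAGTTCCCCCCCCC"
pattern_a = digest(cultivar_a, "GAATTC", 1)
pattern_b = digest(cultivar_b, "GAATTC", 1)
```

The two cultivars give different fragment patterns (two bands versus one uncut band), which is the kind of difference that would be visible on a gel.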

The challenges of RFLP are related to the choice of restriction enzymes, as enzymes that recognize similar sites may yield indistinguishable patterns, complicating the analysis. Additionally, the quality of the PCR amplicons is paramount; impurities can lead to unreliable results. Careful experimental design and validation are necessary to ensure the accuracy of the results [78]. Furthermore, the reproducibility of amplicon sequencing-based detection methods has been a concern.

4.2. High Resolution Melting for DNA Barcoding of Cultivars

High-resolution melting (HRM) has emerged as a powerful technique for the identification and authentication of various plant cultivars through DNA barcoding. This method leverages the unique melting profiles of DNA fragments to distinguish between closely related species and cultivars. The melting curve generated provides a unique profile for each DNA sample, allowing for the differentiation of genotypes based on their melting behavior. This technique is sensitive to even minor variations in DNA sequences, making it suitable for detecting single nucleotide polymorphisms (SNPs) and other genetic variations. There are several examples of using HRM for cultivars identification. Muleo et al. [79] demonstrated the effectiveness of HRM analysis in the genotyping of olive germplasm. The study highlighted the method’s ability to detect both homozygous and heterozygous mutations, confirming its high sensitivity and reproducibility for cultivar identification [79]. Jaakola et al. [80] applied HRM analysis for the authentication of bilberry genotypes. In a study by Bosmali et al. [81], HRM analysis was integrated with microsatellite markers and DNA barcoding to distinguish between different lentil varieties and detect admixtures. Madesis et al. [82] reported the application of Bar-HRM analysis for the authentication of bean crops without prior DNA purification. This method proved effective in identifying and quantifying major Greek and Mediterranean bean genotypes [82].

While HRM analysis presents numerous advantages for cultivar identification, challenges remain in optimizing PCR conditions and primer design to maximize melting curve variability. Future research should focus on refining these parameters and exploring the integration of HRM with next-generation sequencing technologies to enhance the resolution and accuracy of cultivar identification.

4.3. Sanger Sequencing for Amplicon Analysis

Sanger sequencing, a pioneering technique in the field of molecular biology, has emerged as a valuable tool for cultivar barcoding. Recent studies have underscored the efficacy of Sanger sequencing for amplicon analysis in cultivar barcoding. For instance, the successful application of Sanger sequencing to distinguish various banana cultivars has been demonstrated recently [83]. The researchers employed a combination of universal primers and cultivar-specific primers to amplify and sequence the ITS region, yielding a high degree of accuracy in cultivar identification. Recently, we also efficiently used Sanger sequencing to analyze structural cpDNA loci and an ITS region in four subtropical crops [26]. The results helped to accurately distinguish the cultivars within each species.

Despite the advances made using Sanger sequencing-based amplicon analysis, contrasting perspectives and challenges persist. The method can be time-consuming and costly, particularly when dealing with large sample sizes. Some researchers have argued that the technique is limited by its relatively low throughput and high cost compared to newer sequencing technologies. However, Sanger sequencing remains a reliable method for initial screenings and for generating high-quality reference sequences. Its high accuracy and reliability make it an indispensable tool for amplicon analysis, particularly in situations where precise cultivar identification is critical [83]. The integration of Sanger sequencing with other molecular techniques can enhance the overall effectiveness of DNA barcoding. For instance, combining Sanger sequencing with high-resolution melting analysis or real-time PCR can improve the accuracy of species identification, particularly in complex samples [84]. Furthermore, the development of multiplex assays, as demonstrated by Richardson et al. [85], allows for the simultaneous identification of multiple species, streamlining the process of cultivar authentication.

4.4. Next Generation Sequencing (NGS) for Cultivars Identification

The emergence of NGS has significantly enhanced the efficiency and accuracy of DNA barcoding, enabling the simultaneous sequencing of multiple samples and the generation of vast amounts of data. Meta-barcoding, a technique combining NGS with DNA barcoding, has been successfully applied to complex herbal formulations. Pandit et al. [86] demonstrated the use of rbcL gene-based mini-barcodes in high-throughput sequencing to detect plant species in polyherbal products, highlighting NGS’s capacity to analyze multiple species simultaneously. This approach can be adapted for cultivar-level discrimination within plant species, enhancing quality control and authentication processes. Moreover, Shokralla et al. [87] showcased the potential of NGS in enhancing DNA barcode capture from single specimens, allowing for the identification of cryptic species and improving the resolution of taxonomic classifications. NGS facilitates rapid and large-scale sequencing efforts necessary for expanding these reference databases, which are essential for reliable cultivar identification.

Despite the advancements in NGS technologies, several challenges remain in the field of cultivar barcoding. The complexity of plant genomes, particularly in polyploid species, can complicate the interpretation of sequencing data. Additionally, the need for robust reference databases to support accurate species identification is critical. As noted by Antil et al. [88], constructing a comprehensive reference library is essential for the effective application of DNA barcoding. Future research should focus on optimizing NGS protocols to enhance the accuracy and efficiency of cultivar barcoding. The integration of advanced computational tools for data analysis and the development of standardized protocols will be crucial in addressing the current limitations.

4.4.1. Pooled Amplicon Sequencing for Cultivar Identification

Pooled amplicon sequencing has emerged as a powerful tool for cultivar identification, leveraging the advantages of next-generation sequencing (NGS) technologies to provide high-resolution genetic information. This method allows for the simultaneous analysis of multiple samples to distinguish between closely related cultivars. The ability to sequence multiple samples in a single run significantly reduces costs and increases throughput, making it an attractive option for large-scale studies.

Jia et al. [89] emphasized the challenges posed by sequencing errors and cross-contamination in amplicon studies, underscoring the need for robust methodologies to ensure accurate identification of rare taxa. Pooled amplicon sequencing can mitigate these issues by providing a more comprehensive view of genetic diversity. The application of pooled amplicon sequencing in cultivar identification has been demonstrated in various studies. In particular, Urra et al. [90] applied high-throughput amplicon sequencing to identify grapevine clones, revealing genetic variations among clonal selections of Vitis vinifera cultivars. In rye, a pooled sample comprising the DNA of 96 individual plants was efficiently used to evaluate sequence variation in six candidate genes [91]. Our recent study on tea plant showed the efficiency of pooled amplicon sequencing in revealing SNPs and InDels in the amplified genes of different cultivars [92].
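How variants are separated from sequencing errors in a pooled sample can be illustrated with a minimal sketch. It assumes that the reads are already aligned to the reference amplicon without indels, and it reports a substitution only when the read depth and the allele frequency exceed chosen thresholds. The thresholds, the function name, and the toy reads are hypothetical and would need calibration against the error rate of the platform used.

```python
from collections import Counter

def pooled_variants(reference, reads, min_depth=10, min_freq=0.05):
    """Report substitutions supported by enough reads in a pooled sample.

    Returns tuples (1-based position, reference base, alternative base, frequency).
    Reads are assumed to start at reference position 1 and to contain no indels.
    """
    calls = []
    for pos, ref_base in enumerate(reference.upper()):
        counts = Counter(
            r[pos].upper() for r in reads if pos < len(r) and r[pos].upper() in "ACGT"
        )
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate allele frequencies
        for base, n in counts.items():
            if base != ref_base and n / depth >= min_freq:
                calls.append((pos + 1, ref_base, base, round(n / depth, 3)))
    return calls

# Hypothetical pool: 18 reads match the reference, 2 carry G>T at position 3
toy_reference = "ACGTAC"
toy_reads = ["ACGTAC"] * 18 + ["ACTTAC"] * 2
calls = pooled_variants(toy_reference, toy_reads)
```

In a pool of many individuals, a rare allele may fall below the frequency threshold, so the choice of `min_freq` trades sensitivity for protection against sequencing errors.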

Despite its advantages, pooled amplicon sequencing is not without challenges. The choice of target regions and the design of primers can significantly impact the quality and resolution of the data obtained. For instance, Whitford et al. [93] demonstrated that the choice of taxonomic database and variable region of the 16S rRNA gene sequence is critical for achieving accurate species-level identification. Therefore, careful consideration must be given to experimental design to maximize the effectiveness of pooled amplicon sequencing for cultivar identification. Additionally, the potential for sequencing biases, such as those related to GC content and template length, must be addressed. Whitford et al. [94] highlighted the importance of understanding these biases to improve the accuracy of variant discovery using multiplex amplicon sequencing. As the field of genomics continues to evolve, pooled amplicon sequencing is poised to play a pivotal role in cultivar identification and characterization. The integration of advanced sequencing technologies, such as Oxford Nanopore and PacBio, offers the potential for even greater resolution and accuracy in identifying genetic variations among cultivars [95].

4.4.2. Genotyping-by-Sequencing for Cultivars Identification

Genotyping-by-sequencing (GBS) has emerged as a powerful tool for cultivar identification across various plant species. This method leverages NGS technologies to generate SNPs that can be used to assess genetic diversity, establish relationships among cultivars, and facilitate marker-assisted selection in breeding programs.

GBS has been utilized in a variety of crops to enhance the accuracy and efficiency of cultivar identification. For instance, Uitdewilligen et al. [96] used GBS to assess genomic DNA variation in autotetraploid potato cultivars. Their study highlighted the ability to detect numerous sequence variants, which is crucial for cultivar identification and breeding programs. Niimi et al. [97] created a set of 10 SNP markers for acid citrus cultivars, enabling the discrimination of 85 different cultivars. Similarly, Park et al. [98] developed a core set of SNP markers for Korean watermelon cultivars. In addition, Meng et al. [99] utilized GBS to identify commercial cultivars in the Tabebuia alliance, providing a robust basis for patent protection and clarifying the genetic background of these cultivars.
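A small core set of SNP markers, such as those mentioned above, can in principle be chosen from a larger genotype matrix. The following sketch uses a simple greedy heuristic, which is an assumption of this example and not the method of the cited studies: at each step it adds the SNP that separates the most cultivar pairs that are still indistinguishable. The cultivar names, SNP identifiers, and genotypes are hypothetical.

```python
from itertools import combinations

def greedy_snp_panel(genotypes):
    """Greedily select SNPs until all cultivar pairs are separated or no SNP helps.

    genotypes -- {cultivar: {snp_id: genotype string}}
    Returns (ordered list of selected SNP ids, set of still-unresolved pairs).
    """
    cultivars = sorted(genotypes)
    snps = sorted(next(iter(genotypes.values())))
    unresolved = set(combinations(cultivars, 2))
    panel = []
    while unresolved:
        def gain(snp):
            # number of currently unresolved pairs that this SNP would separate
            return sum(genotypes[a][snp] != genotypes[b][snp] for a, b in unresolved)
        best = max(snps, key=gain)
        if gain(best) == 0:
            break  # remaining pairs are identical at every SNP
        panel.append(best)
        unresolved = {
            (a, b) for a, b in unresolved if genotypes[a][best] == genotypes[b][best]
        }
    return panel, unresolved

# Hypothetical genotypes of four cultivars at three SNPs
toy_genotypes = {
    "cvA": {"snp1": "AA", "snp2": "CC", "snp3": "GG"},
    "cvB": {"snp1": "AA", "snp2": "CT", "snp3": "GG"},
    "cvC": {"snp1": "AG", "snp2": "CC", "snp3": "GG"},
    "cvD": {"snp1": "AG", "snp2": "CT", "snp3": "GG"},
}
panel, unresolved = greedy_snp_panel(toy_genotypes)
```

A greedy panel is not guaranteed to be the smallest possible one, but it is fast and usually close to minimal for marker sets of practical size.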

While GBS has proven to be a valuable tool for cultivar identification, challenges remain. The need for high-quality reference genomes and the potential for sequencing errors can complicate data interpretation. However, ongoing advancements in sequencing technologies and bioinformatics tools are expected to address these issues, making GBS an even more powerful resource for plant breeders and geneticists.

4.4.3. RAD-Seq Approach for Cultivar Identification

Recent advancements in molecular techniques, particularly the restriction site-associated DNA sequencing (RAD-seq) approach, have revolutionized the way cultivars are identified. RAD-seq is an NGS technique that allows for the identification of genetic variation across a wide range of species, including those with limited genomic resources. By focusing on specific regions of the genome, RAD-seq generates a large number of single nucleotide polymorphisms (SNPs) that can be used for cultivar identification. This method is particularly advantageous for non-model organisms, where reference genomes may not be available.

The application of RAD-seq has been demonstrated in various plant species, including lavender [100], ryegrass [101], and rhododendron [102]. For instance, a study on Italian ryegrass (Lolium multiflorum) utilized RAD-seq to distinguish between 11 varieties, revealing significant genetic differences [101]. Similarly, in tobacco, RAD-seq was employed to develop insertion–deletion (InDel) markers, which are valuable for genetic studies and marker-assisted selection breeding [103]. The results indicated that RAD-seq could effectively capture intraspecific genetic diversity, facilitating accurate cultivar identification. Comparative studies have highlighted the advantages of RAD-seq over other genomic approaches [104]. The combination of RAD-seq with machine learning algorithms has shown promise in accurately classifying cultivars based on spectral data, as demonstrated in studies involving cotton [105] and olive [106] cultivars. These approaches leverage the strengths of both molecular and computational techniques to improve identification accuracy.

A critical analysis of RAD-seq perspectives reveals that the choice of restriction enzyme, sequencing depth, and data analysis pipeline can significantly impact the accuracy of RAD-seq-based cultivar identification. Future research should focus on optimizing RAD-seq protocols for specific crops, exploring the integration of RAD-seq with other omics technologies, and developing more sophisticated data analysis tools to fully harness the power of this approach.
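The effect of restriction enzyme choice can be explored in silico before any library is prepared. The following Python sketch, given as an assumption-laden illustration rather than an established protocol, digests a sequence with EcoRI (G^AATTC) or PstI (CTGCA^G) and counts the fragments falling into a size window. The size window and helper names are hypothetical; a real analysis would use a reference or draft genome and the size selection of the actual protocol.

```python
import re

# Recognition site and cut offset (bases 5' of the cut on the top strand).
# Both sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {"EcoRI": ("GAATTC", 1), "PstI": ("CTGCAG", 5)}


def cut_positions(sequence, site, offset):
    """Positions at which the top strand is cut (0-based, cut before index)."""
    seq = sequence.upper()
    # A lookahead pattern also reports overlapping occurrences of the site.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def digest(sequence, enzyme):
    """Return the fragments produced by a complete single-enzyme digest."""
    site, offset = ENZYMES[enzyme]
    bounds = [0] + cut_positions(sequence, site, offset) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def count_in_range(fragments, min_len, max_len):
    """Number of fragments inside a size-selection window (inclusive)."""
    return sum(min_len <= len(f) <= max_len for f in fragments)
```

Comparing `count_in_range` across enzymes gives a rough estimate of how many loci each enzyme would sample, which in turn informs the sequencing depth needed per sample.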

4.4.4. Oxford Nanopore Technologies for Cultivar Identification

Oxford Nanopore Technologies (ONT) has emerged as a versatile platform for various genomic applications, including cultivar barcoding, owing to its capacity for long-read sequencing and its portability. One of the foundational advantages of ONT for cultivar barcoding is its capacity to produce long reads that encompass entire barcode regions, such as the ITS and chloroplast genomes, which are critical for species and cultivar identification.

The ability to generate comprehensive genomic datasets using ONT has been demonstrated in chloroplast genome sequencing, which is often used as a plant barcode. Wahyuni et al. [107] successfully assembled a draft chloroplast genome of Dryobalanops aromatica using ONT data. Their work highlights the potential of ONT to produce genome-scale data that can be used for detailed phylogenetic and cultivar differentiation analyses. Similarly, Aury et al. [108] reported on the assembly of the hexaploid wheat genome using long reads from ONT, achieving high-resolution assemblies suitable for research and breeding. These studies underscore the capacity of ONT to generate high-quality, long-read data that are instrumental in resolving complex plant genomes and distinguishing cultivars at the genomic level.

The body of research underscores the significant potential of Oxford Nanopore sequencing in plant cultivar barcoding. Its ability to generate long reads covering entire barcode regions, coupled with advanced error correction and demultiplexing tools, makes it a powerful platform for rapid, cost-effective, and high-throughput cultivar identification. As bioinformatics tools continue to evolve, the accuracy and scalability of ONT-based barcoding are expected to improve further, solidifying its role in plant genetics, breeding, and conservation efforts. The versatility demonstrated across various studies indicates that ONT is well-positioned to become a standard tool in the molecular identification and phylogenetic analysis of cultivars.

4.4.5. Application of Whole-Genome Resequencing as Super-Barcode

In recent years, whole-genome resequencing has taken center stage, allowing for the comprehensive analysis of a plant’s genetic makeup. This has become possible due to the development of NGS technologies. However, whole-genome resequencing of the nuclear genome is prohibitively expensive for widespread adoption, particularly in resource-limited laboratories. A critical analysis of these perspectives reveals that the key to successful implementation lies in balancing technological advancement with economic viability. As the cost of whole-genome resequencing continues to decrease, its application in cultivar identification is likely to become more feasible [109]. Furthermore, the development of more efficient bioinformatic tools and pipelines will be essential for analyzing the vast amounts of data generated by whole-genome resequencing [110]. As this technology continues to evolve, it is likely to play an increasingly vital role in agriculture and horticulture, enabling the precise identification and conservation of valuable crop cultivars.

Resequencing of whole chloroplast and mitochondrial genomes is often used as an efficient approach for DNA barcoding of intraspecific diversity. Recent barcoding studies have placed high emphasis on the use of whole-chloroplast genome sequences as a “super-barcode”, which has become increasingly feasible due to advancements in sequencing technologies [111]. For instance, Zhang et al. [112] conducted a comparative analysis of the chloroplast genome of Lonicera japonica cultivars. Whole-genome sequencing of cpDNA has been used to identify tea (Camellia sinensis) cultivars, providing valuable phylogenomic information [113]. In addition, complete chloroplast genome sequences have been used to identify the medicinal plant Scutellaria baicalensis [35]. For authenticating rice genotypes, the whole chloroplast genome trnL-F region was the most reliable barcode, although it required extensive sequencing and informatic analyses [114]. While whole chloroplast genome sequencing can already deliver a reliable meta-barcode for accurate plant identification, it is not yet resource-effective and does not yet offer the speed of analysis provided by single-locus barcodes to unspecialized laboratory facilities [33].

5. Bioinformatic Tools for Barcoding Data Analysis

The analysis of DNA barcoding data requires robust bioinformatic tools that can handle large datasets generated from sequencing technologies. These tools assist in various stages of the barcoding process, including sequence alignment, phylogenetic analysis, and data visualization. Bioinformatic tools such as MEGA (Molecular Evolutionary Genetics Analysis) and Clustal Omega are commonly used for sequence alignment, allowing researchers to compare genetic sequences across different species. In the context of plant barcoding, large datasets can provide valuable insights into evolutionary processes and species diversity, especially in complex groups with polyploidy and hybridization [115]. However, multiple alignment algorithms become computationally restrictive at this scale, which underscores the need for tools that can efficiently calculate sequence divergences and perform analyses without these restrictions [116]. In the realm of metabarcoding, a variety of bioinformatic tools have been evaluated to address the unique challenges posed by high-throughput sequencing data. A recent review underscores the importance of pipelines that can effectively process reference sequence data, with many tools designed to facilitate the transition from raw data to ecological insights [117]. The metabaR package further contributes to this field by providing an R-based solution aimed at evaluating and improving DNA metabarcoding datasets, although it highlights the ongoing need for tools that seamlessly integrate with ecological analysis workflows [118]. In addition, several bioinformatic tools have been developed to address the challenges associated with large barcoding datasets:

SeqTrace is an open-source software which can identify, align, and compute consensus sequences from matching forward and reverse traces, filter low-quality base calls, and end-trim finished sequences [119]. The software features a graphical interface that includes a full-featured chromatogram viewer and sequence editor. However, SeqTrace needs manual assistance in many sequence analysis tasks and can often generate mismatches and gaps in the final consensus sequences, reducing the reliability of the results.

PIPEBAR was later developed to overcome these problems. This pipeline works with Sanger sequencing chromatograms and allows us to run barcode analysis of hundreds of sequences in a fast, accurate, and concise command line [120]. PIPEBAR is faster than other software and can be used to facilitate the submission of barcode sequences to databases such as BOLD and NCBI.
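The trace-processing steps described for SeqTrace and PIPEBAR (filtering low-quality base calls, end trimming, and building a consensus from forward and reverse reads) can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of either tool: it assumes that both reads span the same amplicon end to end, so that no alignment step is needed, and the quality threshold `min_q` is an arbitrary assumption.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq, quals):
    """Reverse-complement a read and reverse its per-base qualities."""
    return seq.translate(COMPLEMENT)[::-1], quals[::-1]


def consensus(fwd, fwd_q, rev, rev_q, min_q=20):
    """Merge a forward and a reverse read covering the same amplicon.

    At each position the base with the higher Phred quality is kept;
    positions where even the better call is below min_q become 'N',
    and unresolved flanks are trimmed from both ends.
    """
    rev, rev_q = reverse_complement(rev, rev_q)
    if len(fwd) != len(rev):
        raise ValueError("reads must be pre-aligned to equal length")
    merged = []
    for fb, fq, rb, rq in zip(fwd, fwd_q, rev, rev_q):
        base, qual = (fb, fq) if fq >= rq else (rb, rq)
        merged.append(base if qual >= min_q else "N")
    return "".join(merged).strip("N")
```

In practice the two reads overlap only partially and must first be aligned; internal positions that remain `N` indicate sites that should be inspected manually in the chromatogram.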

AmpliPiper, a versatile analysis pipeline, is designed explicitly for multilocus amplicon sequencing data generated through third-generation sequencing technologies such as Oxford Nanopore. This tool automates bioinformatics workflows, enabling researchers to manage large datasets effectively and extract meaningful taxonomic and biodiversity insights [95]. Its user-friendly interface and automation capabilities address the challenges posed by high-throughput sequencing, making it suitable for large-scale biodiversity assessments.

DNA Subway is a platform that integrates research-grade bioinformatics software into streamlined workflows, facilitating sequence analysis and data interpretation [121]. This approach simplifies the process for researchers by bundling essential tools into user-friendly pipelines, thereby enhancing accessibility and efficiency. Similarly, the Cogent NGS Analysis Pipeline exemplifies a comprehensive framework capable of handling diverse sequencing data types, including gene, single-cell DNA, and other NGS datasets, with functionalities tailored for gene identification and functional annotation [Cogent™ NGS Analysis Pipeline].

Multiplexing and demultiplexing samples are vital for high-throughput cultivar barcoding. Sample demultiplexing is supported by tools such as those developed by 10x Genomics, which utilize unique DNA barcodes for sample identification. These tools enable efficient separation of pooled samples, ensuring accurate downstream analysis [https://www.10xgenomics.com/analysis-guides/bioinformatics-tools-for-sample-demultiplexing (accessed on 13 July 2025)]. Han et al. [122] developed HycDemux, a hybrid unsupervised approach that improves demultiplexing accuracy by integrating nanopore signals and DNA sequences. This approach outperforms existing tools, especially in complex multi-sample datasets, making it suitable for large-scale cultivar barcoding projects. Similarly, Papetti et al. [123] presented UNPLEX, a framework that operates directly on raw nanopore signals for barcode demultiplexing, further enhancing the efficiency and accuracy of sample identification. Additionally, the development of versatile frameworks like mBARq demonstrates efforts to create user-friendly analysis environments capable of handling barcoded sequencing data, although the current tools often lack the flexibility required for diverse experimental designs [124].
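The basic principle of sequence-based demultiplexing can be sketched in a few lines of Python. The example below is a generic illustration and does not reproduce HycDemux, UNPLEX, or any vendor tool, which rely on more elaborate models or on raw signal data. It assumes that each read starts with a fixed-length sample barcode and tolerates a configurable number of mismatches (`max_mismatches`, an assumed parameter).

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Assign each read to a sample according to its leading barcode.

    reads: iterable of sequences; barcodes: {sample: barcode}, all of the
    same length. Returns {sample: [inserts]} plus an "unassigned" bin for
    reads that are too short, too divergent, or equally close to two samples.
    """
    lengths = {len(b) for b in barcodes.values()}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    n = lengths.pop()
    bins = {sample: [] for sample in barcodes}
    bins["unassigned"] = []
    for read in reads:
        prefix = read[:n]
        scored = sorted((hamming(prefix, bc), s) for s, bc in barcodes.items())
        best_dist, best_sample = scored[0]
        ambiguous = len(scored) > 1 and scored[1][0] == best_dist
        if len(prefix) < n or best_dist > max_mismatches or ambiguous:
            bins["unassigned"].append(read)
        else:
            bins[best_sample].append(read[n:])  # strip the barcode
    return bins
```

Hamming distance handles substitutions only; because nanopore reads frequently contain insertions and deletions, practical tools use alignment-based or signal-based matching instead.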

Error correction and consensus sequence generation are critical for accurate cultivar barcoding, given the higher error rates associated with ONT sequencing compared to short-read platforms. Sahlin et al. [125] introduced NGSpeciesID, a software tool designed to produce highly accurate consensus sequences from long-read amplicon data, including ONT reads. This tool minimizes preprocessing and enhances scalability, enabling rapid processing of large sample sets while maintaining high accuracy. Rafeie et al. [126] introduced Medlib, a high-performance alignment library tailored for noisy long-read ONT data, facilitating the accurate sequence alignment necessary for barcode analysis. Such tools are essential for reliable ONT-based cultivar barcoding workflows, as they improve the usability of ONT data for barcode-based discrimination.
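To illustrate why alignment that tolerates insertions and deletions matters for noisy long reads, the following Python sketch assigns a read to the closest reference barcode by edit (Levenshtein) distance. It is a textbook dynamic-programming implementation with quadratic running time, shown only as an assumption-based illustration; it does not reflect the implementation of any tool cited above, and the reference names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion from a
                curr[j - 1] + 1,           # insertion into a
                prev[j - 1] + (ca != cb),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def nearest_reference(read, references):
    """Return (name, distance) of the closest reference sequence.

    references: {name: sequence}. If two references are equally close,
    the name is returned as None to signal an ambiguous assignment.
    """
    scored = sorted((edit_distance(read, seq), name)
                    for name, seq in references.items())
    best_dist, best_name = scored[0]
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None, best_dist
    return best_name, best_dist
```

For cultivar-level work the references often differ by only a few positions, so single reads are rarely sufficient; assignment is more reliable when applied to a consensus built from many reads.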

For forensic applications, specialized bioinformatic pipelines, such as NGSpeciesID, have been developed to process Nanopore sequencing data for species identification, emphasizing the importance of tailored solutions for specific use cases [127]. The inclusion of profile-hidden Markov models (HMMs) in sequence analysis further enhances the accuracy of barcode data interpretation by filtering pseudogene sequences that could otherwise lead to misleading results [128].

VAREANT is a bioinformatics application designed to streamline the preparation of genomic variant data. It comprises three modules: pre-processing, variant annotation, and AI/ML data preparation. This tool simplifies the workflow for curating targeted variant datasets, making it particularly useful for researchers dealing with large genomic datasets [129]. By enabling efficient data preparation, VAREANT supports the identification of novel biomarkers and the stratification of patients based on disease risk factors.

Taxonize-gb is a command-line software tool designed to filter GenBank non-redundant databases based on taxonomy. This tool allows researchers to create taxa-specific reference databases tailored to their research questions, significantly reducing search times and improving the efficiency of data analysis [130]. By enabling the extraction of relevant sequences, Taxonize-gb enhances the accuracy of species identification in metagenomic studies.

QuasiFlow is another bioinformatic tool that focuses on analyzing genetic variability from NGS data. It is particularly useful for studying quasispecies, which are populations of closely related viral genomes. QuasiFlow can extract reliable mutations and recombinations, even at low frequencies, making it a valuable resource for researchers investigating genetic diversity within species [131]. This capability is essential for understanding evolutionary dynamics and species adaptation.

The superSTR tool offers an alignment-free method for detecting repeat expansions in DNA and RNA sequencing data. This ultrafast tool is capable of efficiently processing large datasets, making it suitable for applications in population genetics [132]. Its ability to identify repeat expansion motifs can complement traditional DNA barcoding approaches by providing additional genetic information relevant to species identification.

Despite the advancements in bioinformatic tools for DNA barcoding analysis, several challenges remain. The complexity of plant genomes, which often contain large amounts of repetitive sequences, can complicate data analysis and interpretation. Additionally, the need for standardized protocols and databases is critical for ensuring the reliability and reproducibility of barcoding results. Furthermore, as sequencing technologies continue to evolve, bioinformatic tools must adapt to handle the increasing complexity and volume of data generated. Future developments in bioinformatics, particularly in machine learning and artificial intelligence, hold great promise for enhancing the analysis of DNA barcoding data. These technologies can improve the accuracy of species identification and facilitate the discovery of new plant species by analyzing vast datasets more efficiently [133].

6. Barcoding Databases and Libraries

The accuracy of DNA barcoding heavily depends on well-curated reference libraries [134]. The cited study highlights the necessity of incorporating expert validation and comprehensive metadata to ensure the reliability of large datasets, which in turn influences the performance of bioinformatics tools used for species identification. In 2003, the first DNA barcode research center was established in Canada [135]. Today, the Canadian Centre for DNA Barcoding (CCDB) offers research services supported by more than USD 7 million in sequencing and liquid handling instrumentation and more than 20 expertly skilled staff (CCDB. http://dnabarcoding.ca/, accessed on 13 July 2025). The mission of the CCDB is to integrate DNA barcoding technology within their research programs and day-to-day operations. The development of cpDNA genetic databases demonstrates the growing efforts towards standardized data collection and sharing. Nowadays, there are several widely used DNA barcode databases. The Barcode of Life Data System (BOLD, http://boldsystems.org/, accessed on 13 July 2025) is a central resource for biodiversity science aimed at the acquisition, storage, validation, analysis, and publication of DNA barcodes [136]. However, this platform is mainly used for the identification of species and genera. On the other hand, a platform for the intraspecific diversity of local crop collections worldwide is of crucial importance for the sustainability of agricultural systems.

There are several examples of national barcoding libraries. In China, a DNA barcode identification system was developed for the identification of herbal plants; it was established based on a two-locus combination of ITS2+psbA–trnH barcodes (http://www.tcmbarcode.cn/china/, accessed on 13 July 2025). In addition, a comprehensive DNA barcoding library for woody plants was developed for the conservation and management of tropical and subtropical forests [137]. The dataset includes a standard barcode library comprising the four most widely used barcodes (rbcL, matK, ITS, and ITS2) for 2520 species from 4654 samples. In the UK, a national DNA barcoding reference library was created based on Sanger sequencing of rbcL, matK, and ITS2, and it contains 1482 plant species. Species-level discrimination was highest with ITS2; however, the ability to successfully retrieve a sequence from herbarium samples was lowest for this locus [2]. In Peru, a DNA barcode reference library of the Lomas region was established based on rbcL, matK, and ITS2 loci. This database provides 1207 plant specimens from 16 Lomas locations in Peru [138]. In Greece, the ‘Greek Vitis Database’ gathers information about Greek cultivars of Vitis vinifera and highlights a multifaceted approach to cultivar characterization [18].

7. Conclusions

To summarize, the selection of appropriate loci that can discriminate between closely related varieties remains a critical aspect of developing effective identification systems for cultivars. Based on the results of our review, we propose to use a combination of 3–4 chloroplast loci (trnE-UUC/trnT-GUU, rpl23/rpl2.l, psbA-trnH, trnL-trnF, trnK, rpoC1, ycf1-a, rpl32-trnL, trnH-psbA and matK); this combination has to be selected for each particular crop. The use of the ITS region to complete this combination can be useful in some cases. This approach can be efficient for large-scale barcoding efforts. Sanger sequencing can be a reliable method of amplicon analysis in such experiments. On the other hand, NGS technologies and resequencing of whole chloroplast and mitochondrial genomes can be reliable approaches to develop super-barcodes of elite varieties. The development of comprehensive barcode reference libraries for local germplasm collections is expected to increase the throughput and resolution of the barcoding methods [139]. In addition, automation and standardization of barcoding protocols are expected to facilitate the large-scale identification and certification of the intraspecific diversity of plants. While the application of barcoding to plant varieties is still evolving, advances in medical barcoding and macroalgal research demonstrate the versatility and expanding scope of DNA-based sample identification methods [140].
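The crop-specific selection of a locus combination proposed above can be approached computationally once sequences of the candidate loci are available for a set of cultivars. The Python sketch below is a minimal illustration under strong assumptions: sequences are already trimmed and aligned per locus, and two cultivars count as distinguishable only if their concatenated sequences differ. The function names and the input layout are hypothetical.

```python
from itertools import combinations


def distinct_haplotypes(sequences, loci):
    """Number of distinct multi-locus haplotypes across all cultivars."""
    return len({tuple(seqs[locus] for locus in loci)
                for seqs in sequences.values()})


def best_combinations(sequences, max_loci=4):
    """For each combination size k, find the loci separating most cultivars.

    sequences: {cultivar: {locus: sequence}} with the same loci for every
    cultivar. Returns {k: (loci_tuple, number_of_distinct_haplotypes)}.
    """
    loci = sorted(next(iter(sequences.values())))
    results = {}
    for k in range(1, min(max_loci, len(loci)) + 1):
        best = max(combinations(loci, k),
                   key=lambda combo: distinct_haplotypes(sequences, combo))
        results[k] = (best, distinct_haplotypes(sequences, best))
    return results
```

When the number of distinct haplotypes stops increasing with k, adding further loci brings no extra resolution for that collection, which gives a practical stopping point for the size of the barcode.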

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms26146808/s1.

Funding

The study was carried out as part of the work of the Breeding and Seed Center for Subtropical, Citrus and Nut Crops of FRC SSC RAS, Agreement No. 075-15-2025-190 dated 17 April 2025.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Hebert, P.D.N.; Cywinska, A.; Ball, S.L.; de Waard, J.R. Biological identifications through DNA barcodes. Proc. R. Soc. B Biol. Sci. 2003, 270, 313–321.
Jones, L.; Twyford, A.D.; Ford, C.R.; Rich, T.C.G.; Davies, H.; Forrest, L.L.; Hart, M.L.; McHaffie, H.; Brown, M.R.; Hollingsworth, P.M.; et al. Barcode UK: A complete DNA barcoding resource for the flowering plants and conifers of the United Kingdom. Mol. Ecol. Resour. 2021, 21, 2050–2062.
Xiang, X.G.; Hu, H.; Wang, W.; Jin, X.H. DNA barcoding of the recently evolved genus Holcoglossum (Orchidaceae: Aeridinae): A test of DNA barcode candidates. Mol. Ecol. Resour. 2011, 11, 1012–1021.
Abubakar, B.M.; Salleh, F.M.; Omar, M.S.S.; Wagiran, A. DNA Barcoding and Chromatography Fingerprints for the Authentication of Botanicals in Herbal Medicinal Products. Evid.-Based Complement. Altern. Med. 2017, 2017, 1352948.
Shneer, V.S.; Rodionov, A.V. DNA barcodes of plants. Adv. Mod. Biol. 2018, 138, 531–537.
Shekhovtsov, S.V.; Shekhovtsova, I.N.; Peltek, S.E. DNA-barcoding: Methods and approaches. Adv. Mod. Biol. 2019, 139, 211–220.
Savina, N.V.; Kubrak, S.V.; Milko, L.V.; Kilchevsky, A.V.; Nikitina, E.V.; Tozhibaev, K.S. DNA-barcoding as a tool for environmental monitoring and assessment of species diversity of rare plant species. Mol. Appl. Genet. 2020, 29, 25–36.
Shadrin, D.M. DNA-barcoding: Areas of application. Genetics 2021, 57, 478–488.
Musou-Yahada, A.; Honjoh, K.I.; Yamamoto, K.; Miyamoto, T.; Ohta, H. Utilization of single nucleotide polymorphism-based allele-specific PCR to identify Shiikuwasha (Citrus depressa Hayata) and Calamondin (Citrus madurensis Lour.) in processed juice. Food Sci. Technol. Res. 2019, 25, 19–27.
Nazar, N.; Saxena, A.; Sebastian, A.; Slater, A.; Sundaresan, V.; Sgamma, T. Integrating DNA Barcoding Within an Orthogonal Approach for Herbal Product Authentication: A Narrative Review. Phytochem. Anal. 2024, 36, 7–29.
DeSalle, R.; Goldstein, P. Review and Interpretation of Trends in DNA Barcoding. Front. Ecol. Evol. 2019, 7, 302.
Kress, W.J.; Wurdack, K.J.; Zimmer, E.A.; Weigt, L.A.; Janzen, D.H. Use of DNA barcodes to identify flowering plants. Proc. Natl. Acad. Sci. USA 2005, 102, 8369–8374.
Hamza, H.; Villa, S.; Torre, S.; Marchesini, A.; Benabderrahim, M.A.; Rejili, M.; Sebastiani, F. Whole mitochondrial and chloroplast genome sequencing of Tunisian date palm cultivars: Diversity and evolutionary relationships. BMC Genom. 2023, 24, 772.
Park, H.S.; Jayakodi, M.; Lee, S.H.; Jeon, J.H.; Lee, H.O.; Park, J.Y.; Moon, B.C.; Kim, C.K.; Wing, R.A.; Newmaster, S.G.; et al. Mitochondrial plastid DNA can cause DNA barcoding paradox in plants. Sci. Rep. 2020, 10, 6112.
Ying, Z.; Awais, M.; Akter, R.; Xu, F.; Baik, S.; Jung, D.; Yang, D.C.; Kwak, G.-Y.; Wenying, Y. Discrimination of Panax ginseng from counterfeits using single nucleotide polymorphism: A focused review. Front. Plant Sci. 2022, 13, 903306.
Xiao, S.; Xu, P.; Deng, Y.; Dai, X.; Zhao, L.; Heider, B.; Zhang, A.; Zhou, Z.; Cao, Q. Comparative analysis of chloroplast genomes of cultivars and wild species of sweetpotato (Ipomoea batatas [L.] Lam). BMC Genom. 2021, 22, 262.
Ekram, A.E.; Hamilton, R.; Campbell, M.; Plett, C.; Kose, S.H.; Russell, J.M.; Stevenson, J.; Coolen, M.J.L. A 1 Ma Record of Climate-induced Vegetation Changes Using Sed ADNA and Pollen in A Biodiversity Hotspot: Lake Towuti, Sulawesi, Indonesia. In Proceedings of the EGU General Assembly 2021, Online, 19–30 April 2021; p. EGU21-4003.
Bibi, A.; Marountas, J.; Kouklinos, Y.; Kafetzopoulos, D.; Lefort, F.; Roubelakis-Angelakis, K. Revitalization of The Greek Vitis Database: A Multimedia Web-backed Genetic Database for Germplasm Management of Vitis Resources in Greece. J. Wine Res. 2021, 32, 1–10.
Dong, W.; Cheng, T.; Li, C.; Xu, C.; Long, P.; Chen, C.; Zhou, S. Discriminating plants using the DNA barcode rbcLb: An appraisal based on a large data set. Mol. Ecol. Resour. 2013, 14, 336–343.
Kress, W.J.; Erickson, D.L. A two-locus global DNA barcode for land plants: The coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS ONE 2007, 2, e508.
CBOL Plant Working Group. A DNA barcode for land plants. Proc. Natl. Acad. Sci. USA 2009, 106, 12794–12797.
Haider, N.; Wilkinson, M.J. A Set of Plastid DNA-Specific Universal Primers for Flowering Plants. Russ. J. Genet. 2011, 47, 1066–1077.
Corvalán, L.C.J.; de Melo-Ximenes, A.A.; Carvalho, L.R.; e Silva-Neto, C.M.; Diniz-Filho, J.A.F.; de C Telles, M.P.; Nunes, R. Is There a Key Primer for Amplification of Core Land Plant DNA Barcode Regions (rbcL and matK)? Ecol. Evol. 2025, 15, e70961.
China Plant BOL Group; Li, D.Z.; Gao, L.M.; Li, H.T.; Wang, H.; Ge, X.J.; Liu, J.Q.; Chen, Z.D.; Zhou, S.L.; Chen, S.L.; et al. Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. Proc. Natl. Acad. Sci. USA 2011, 108, 19641–19646.
Dong, W.; Xu, C.; Li, C.; Sun, J.; Zuo, Y.; Shi, S.; Cheng, T.; Guo, J.; Zhou, S. ycf1, the most promising plastid DNA barcode of land plants. Sci. Rep. 2015, 5, 8348.
Samarina, L.S.; Koninskaya, N.G.; Shkhalakhova, R.M.; Simonyan, T.A.; Tsaturyan, G.A.; Shurkina, E.S.; Kulyan, R.V.; Omarova, Z.M.; Tutberidze, T.V.; Ryndin, A.V.; et al. The potential of universal primers for barcoding of subtropical crops: Actinidia, Feijoa, Citrus, and Tea. Int. J. Mol. Sci. 2025; submitted.
Dong, W.; Liu, J.; Yu, J.; Wang, L.; Zhou, S. Highly Variable Chloroplast Markers for Evaluating Plant Phylogeny at Low Taxonomic Levels and for DNA Barcoding. PLoS ONE 2012, 7, e35071.
Galimberti, A.; Labra, M.; Sandionigi, A.; Bruno, A.; Mezzasalma, V.; De Mattia, F. DNA Barcoding for Minor Crops and Food Traceability. Adv. Agric. 2014, 2014, 831875.
Sayed, H.A.; Mostafa, S.; Haggag, I.M.; Hassan, N.A. DNA Barcoding of Prunus Species Collection Conserved in the National Gene Bank of Egypt. Mol. Biotechnol. 2023, 65, 410–418.
Pang, X.; Liu, C.; Shi, L.; Liu, R.; Liang, D.; Li, H.; Cherny, S.S.; Chen, S. Utility of the trnH-psbA Intergenic Spacer Region and Its Combinations as Plant DNA Barcodes: A Meta-Analysis. PLoS ONE 2012, 7, e48833.
Feng, S.; Jiao, K.; Zhu, Y.; Wang, H.; Jiang, M.; Wang, H. Molecular identification of species of Physalis (Solanaceae) using a candidate DNA barcode: The chloroplast psbA-trnH intergenic region. Genome 2018, 61, 15–20.
Awad, M.; Fahmy, R.M.; Mosa, K.A.; Helmy, M.; El-Feky, F.A. Identification of effective DNA barcodes for Triticum plants through chloroplast genome-wide analysis. Comput. Biol. Chem. 2017, 71, 20–31.
Jena, B.; Pati, K.; Nedunchezhiyan, M.; Giri, A.K.; Acharya, V. Application of DNA Barcode for Cultivar Identification in Tuber Crops. Biotica Res. Today 2022, 4, 606–608.
Badr, A.; Elsherif, N.; Aly, S.; Ibrahim, S.; Ibrahim, M. Genetic Diversity among Selected Medicago sativa Cultivars Using Inter-Retrotransposon-Amplified Polymorphism, Chloroplast DNA Barcodes and Morpho-Agronomic Trait Analyses. Plants 2020, 9, 995.
Jiang, Y.; Zhu, C.; Wang, S.; Wang, F.; Sun, Z. Identification of three cultivated varieties of Scutellaria baicalensis using the complete chloroplast genome as a super-barcode. Sci. Rep. 2023, 13, 5602.
Ma, Y.-P.; Zhao, L.; Zhang, W.-J.; Zhang, Y.-H.; Xing, X.; Duan, X.-X.; Hu, J.; Harris, A.J.; Liu, P.-L.; Dai, S.-L.; et al. Origins of cultivars of Chrysanthemum-Evidence from the chloroplast genome and nuclear LFY gene. J. Syst. Evol. 2020, 58, 925–944.
Al-Andal, A. Unraveling the genetic diversity and evolutionary lineages of Catharanthus roseus cultivars through plastome analysis and DNA barcoding. Crop Pasture Sci. 2025, 76, CP24363.
Attia, O.A.; Ismail, A.I.; El Dessoky, S.D.; Bandar, S.A. Using of DNA-Barcoding, SCoT and SDS-PAGE Protein to Assess Soma-Clonal Variation in Micro-Propagated Fig (Ficus carica L.) Plant. Pak. J. Biol. Sci. 2022, 25, 415–425.
Chen, S.; Yao, H.; Han, J.; Liu, C.; Song, J.; Shi, L.; Zhu, Y.; Ma, X.; Gao, T.; Pang, X.; et al. Validation Of The ITS2 Region As A Novel DNA Barcode For Identifying Medicinal Plant Species. PLoS ONE 2010, 5, e8613.
Li, R.; Dao, Z. Identification of Meconopsis Species By A DNA Barcode Sequence: The Nuclear Internal Transcribed Spacer (ITS) Region of Ribosomal Deoxyribonucleic Acid (DNA). Afr. J. Biotechnol. 2011, 10, 1802–1807.
Dissanayake, U.H.K.; Senevirathna, R.W.K.M.; Ranaweera, L.T.; Wijesundara, W.W.M.U.K.; Jayarathne, H.S.M.; Weebadde, C.K.; Sooriyapathirana, S.D.S.S. Characterization of Cassava (Manihot Esculenta Crantz) Cultivars in Sri Lanka Using Morphological, Molecular and Organoleptic Parameters. Trop. Agric. Res. 2019, 30, 51–70.
Tatlises, M.B.; Hasançebi, S. Identification of Lens Cultivars in Market By Molecular Tools: DNA Barcoding and SSRs. Trak. Univ. J. Nat. Sci. 2023, 24, 91–100.
Dhivya, S.; Ashutosh, S.; Gowtham, I.; Baskar, V.; Harini, A.B.; Mukunthakumar, S.; Sathishkumar, R. Molecular Identification and Evolutionary Relationships Between The Subspecies of Musa By DNA Barcodes. BMC Genom. 2020, 21, 659.
Hidayat, T.; Abdullah, F.I.; Kuppusamy, C.; Samad, A.A.; Wagiran, A. Molecular Identification of Malaysian Pineapple Cultivar Based on Internal Transcribed Spacer Region. APCBEE Procedia 2012, 4, 146–151.
Hürkan, K. Employing Barcode High-Resolution Melting Technique for Authentication of Apricot Cultivars. J. Agric. Sci.-Tarim Bilim. Derg. 2022, 28, 251–258.
Meghana, B.N.; Reshma, S.V. DNA Barcoding of Geographical Indication Tagged Byadagi Chilli and Its Cultivars Using ITS2, MatK and RbcL Coding Sequences. Mol. Biol. Rep. 2025, 52, 286.
Nguyen, T.P.; Do, K.T. Identification of Mango (Mangifera indica L.) cultivars in the Mekong Delta using ISSR markers and DNA barcodes. J. Appl. Biol. Biotechnol. 2025, 13, 68–75.
Alzahrani, O.F.; Dguimi, H.M.; Alshaharni, M.O.; Albalawi, D.; Zaoui, S. Employing plant DNA barcodes for pomegranate species identification in Al-Baha Region, Saudi Arabia. J. Umm Al-Qura Univ. Appl. Sci. 2024, 10, 136–144.
Castro, C.; Hernandez, A.; Alvarado, L.; Flores, D. DNA Barcodes in Fig Cultivars (Ficus carica L.) Using ITS Regions of Ribosomal DNA, the psbA-trnH Spacer and the matK Coding Sequence. Am. J. Plant Sci. 2015, 6, 95–102.
Kjærandsen, J. Current State of DNA Barcoding of Sciaroidea (Diptera)—Highlighting the Need to Build the Reference Library. Insects 2022, 13, 147.
Ho, V.T.; Tu, N.T.; Nguyen, T.A. DNA fingerprinting and molecular characterization of mango (Mangifera spp.) cultivars in Vietnam using ITS DNA barcode. Bulg. J. Agric. Sci. 2021, 27, 128.
Gogoi, B.; Wann, S.B.; Saikia, S.P. DNA barcodes for delineating Clerodendrum species of North East India. Sci. Rep. 2020, 10, 13490.
Wang, J.; Yan, Z.; Zhong, P.; Shen, Z.; Yang, G.; Ma, L. Screening of universal DNA barcodes for identifying grass species of Gramineae. Front. Plant Sci. 2022, 13, 998863.
Raskoti, B.B.; Ale, R. DNA barcoding of medicinal orchids in Asia. Sci. Rep. 2021, 11, 23651.
Kumar, J.; Choudhary, K.; Shelja, A.; Singh, H.; Kumar, A.; Bagga, P. Molecular Approaches for Authentication and Identification of Medicinal Plants. Plant Mol. Biol. Rep. 2025.
Said, E.M.; Hassan, M.E. DNA barcodes in Egyptian olive cultivars (Olea europaea L.) using the rbcL and matK coding sequences. J. Crop Sci. Biotechnol. 2023, 26, 447–456.
Boudadi, I.; Lachheb, M.; Merzougui, S.E.; Lachguer, K.; Serghini, M.A. Exploring genetic variation in saffron (Crocus sativus L.) accessions through two-locus DNA barcoding. Ind. Crops Prod. 2024, 220, 119177.
Abouseada, H.H.; Mohamed, A.-S.H.; Teleb, S.S.; Bard, A.; Tantawy, M.E.; Ibrahim, S.D.; Ellmouni, F.Y.; Ibrahim, M. Genetic diversity analysis in wheat cultivars using SCoT and ISSR markers, chloroplast DNA barcoding and grain SEM. BMC Plant Biol. 2023, 23, 193. [Google Scholar] [CrossRef]
Zhang, M.; Zhai, X.; He, L.; Wang, Z.; Cao, H.; Wang, P.; Ren, W.; Ma, W. Morphological description and DNA barcoding research of nine Syringa species. Front. Genet. 2025, 16, 1544062. [Google Scholar] [CrossRef]
Chen, Y.; Zhu, X.; Loukopoulos, P.; Weston, L.; Albrecht, D.E.; Quinn, J.C. Genotypic identification of Panicum spp. in New South Wales, Australia using DNA barcoding. Sci. Rep. 2021, 11, 16055. [Google Scholar] [CrossRef]
Aljuaid, B.S.; Ismail, I.A.; Attia, A.O.; Dessoky, D.S. Genetic Stability of in vitro Propagated Grapevine (Vitis vinifera L.) cv. Al-Bayadi. J. Agric. Crops 2022, 8, 12–19. [Google Scholar] [CrossRef]
Letsiou, S.; Madesis, P.; Vasdekis, E.; Montemurro, C.; Grigoriou, M.E.; Skavdis, G.; Moussis, V.; Koutelidakis, A.E.; Tzakos, A.G. DNA Barcoding as a Plant Identification Method. Appl. Sci. 2024, 14, 1415. [Google Scholar] [CrossRef]
Wang, P.; Zhang, J.; Zhang, Y.; Su, H.; Qiu, X.; Gong, L.; Huang, J.; Bai, J.; Huang, Z.; Xu, W. Chemical and genetic discrimination of commercial Guangchenpi (Citrus reticulata ‘Chachi’) by using UPLC-QTOF-MS/MS based metabolomics and DNA barcoding approaches. RSC Adv. 2019, 9, 23373–23381. [Google Scholar] [CrossRef]
Fomina, N.A.; Antonova, O.Y.; Chukhina, I.G.; Gimaeva, E.A.; Stashevski, Z.; Gavrilenko, T.A. Nomenclatural Standards and Genetic Passports of Potato Cultivars Bred By The Tatar Research Institute of Agriculture «Kazan Scientific Center of The Russian Academy of Sciences». Plant Biotechnol. Breed. 2020, 3, 55–67. [Google Scholar] [CrossRef]
Kress, W.J. Plant DNA barcodes: Applications today and in the future. J. Syst. Evol. 2017, 55, 291–307. [Google Scholar] [CrossRef]
Barcaccia, G.; Palumbo, F.; Scariolo, F.; Vannozzi, A.; Borin, M.; Bona, S. Potentials and Challenges of Genomics for Breeding Cannabis Cultivars. Front. Plant Sci. 2020, 11, 573299. [Google Scholar] [CrossRef] [PubMed]
Cheng, T.; Xu, C.; Lei, L.; Li, C.; Zhang, Y.; Zhou, S. Barcoding the kingdom Plantae: New PCR primers for ITS regions of plants with improved universality and specificity. Mol. Ecol. Resour. 2016, 16, 138–149. [Google Scholar] [CrossRef]
de Vere, N.; Rich, T.C.; Ford, C.R.; Trinder, S.A.; Long, C.; Moore, C.W.; Satterthwaite, D.; Davies, H.; Allainguillaume, J.; Ronca, S.; et al. DNA barcoding the native flowering plants and conifers of Wales. PLoS ONE 2012, 7, e37945. [Google Scholar] [CrossRef]
Kress, W.J.; Erickson, D.L.; Jones, F.A.; Swenson, N.G.; Perez, R.; Sanjur, O.; Bermingham, E. Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proc. Natl. Acad. Sci. USA 2009, 106, 18621–18626. [Google Scholar] [CrossRef]
Fazekas, A.J.; Burgess, K.S.; Kesanakurti, P.R.; Graham, S.W.; Newmaster, S.G.; Husband, B.C.; Percy, D.M.; Hajibabaei, M.; Barrett, S.C.H. Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS ONE 2008, 3, e2802. [Google Scholar] [CrossRef]
Fay, M.F.; Swensen, S.M.; Chase, M.W. Taxonomic affinities of Medusagyne oppositifolia (Medusagynaceae). Kew Bull. 1997, 52, 111–120. [Google Scholar] [CrossRef]
Ford, C.S.; Ayres, K.L.; Toomey, N.; Haider, N.; Van Alphen Stahl, J.; Kelly, L.J.; Wikström, N.; Hollingsworth, P.M.; Duff, R.J.; Hoot, S.B.; et al. Selection of candidate coding DNA barcoding regions for use on land plants. Bot. J. Linn. Soc. 2009, 159, 1–11. [Google Scholar] [CrossRef]
de Vere, N.; Rich, T.C.; Trinder, S.A.; Long, C. DNA barcoding for plants. In Plant Genotyping: Methods and Protocols; Batley, J., Ed.; Humana Press: New York, NY, USA, 2015; pp. 101–118. [Google Scholar] [CrossRef]
Cuénoud, P.; Savolainen, V.; Chatrou, L.W.; Powell, M.; Grayer, R.J.; Chase, M.W. Molecular phylogenetics of Caryophyllales based on nuclear 18S rDNA and plastid rbcL, atpB, and matK DNA sequences. Am. J. Bot. 2002, 89, 132–144. [Google Scholar] [CrossRef] [PubMed]
Li, L.; Hu, Y.; He, M.; Zhang, B.; Wu, W.; Cai, P.; Huo, D.; Hong, Y. Comparative chloroplast genomes: Insights into the evolution of the chloroplast genome of Camellia sinensis and the phylogeny of Camellia. BMC Genom. 2021, 22, 138. [Google Scholar] [CrossRef] [PubMed]
Wolff, K.; Zietkiewicz, E.; Hofstra, H. Identification of chrysanthemum cultivars and stability of DNA fingerprint patterns. Theor. Appl. Genet. 1995, 91, 439–447. [Google Scholar] [CrossRef]
Caetano-Anollés, G.; Bassam, B.J.; Gresshoff, P.M. Enhanced detection of polymorphic DNA by multiple arbitrary amplicon profiling of endonuclease-digested DNA: Identification of markers tightly linked to the supernodulation locus in soybean. Molec. Gen. Genet. 1993, 241, 57–64. [Google Scholar] [CrossRef]
Herten, K.; Hestand, M.S.; Vermeesch, J.R.; Van Houdt, J.K.J. GBSX: A toolkit for experimental design and demultiplexing genotyping by sequencing experiments. BMC Bioinform. 2015, 16, 73. [Google Scholar] [CrossRef]
Muleo, R.; Colao, M.C.; Miano, D.; Cirilli, M.; Intrieri, M.C.; Baldoni, L.; Rugini, E. Mutation scanning and genotyping by high-resolution DNA melting analysis in olive germplasm. Genome 2009, 52, 252–260. [Google Scholar] [CrossRef]
Jaakola, L.; Suokas, M.; Häggman, H. Novel approaches based on DNA barcoding and high-resolution melting of amplicons for authenticity analyses of berry species. Food Chem. 2010, 123, 494–500. [Google Scholar] [CrossRef]
Bosmali, I.; Ganopoulos, I.; Madesis, P.; Tsaftaris, A. Microsatellite and DNA-barcode regions typing combined with high resolution melting (HRM) analysis for food forensic uses: A case study on lentils (Lens culinaris). Food Res. Int. 2012, 46, 141–147. [Google Scholar] [CrossRef]
Madesis, P.; Ganopoulos, I.; Anagnostis, A.; Tsaftaris, A. The application of Bar-HRM (Barcode DNA-High Resolution Melting) analysis for authenticity testing and quantitative detection of bean crops (Leguminosae) without prior DNA purification. Food Control 2012, 25, 576–582. [Google Scholar] [CrossRef]
Zeng, H.; Huang, B.; Xu, L.; Wu, Y. Banana Classification Using Sanger Sequencing of the Ribosomal DNA Internal Transcribed Spacer (ITS) Region. Plants 2024, 13, 2173. [Google Scholar] [CrossRef]
Fernandes, T.J.R.; Amaral, J.S.; Mafra, I. DNA Barcode Markers Applied to Seafood Authentication: An Updated Review. Crit. Rev. Food Sci. Nutr. 2020, 61, 3904–3935. [Google Scholar] [CrossRef] [PubMed]
Richardson, M.A.; Nenadic, N.; Wingfield, M.; McDougall, C. The Development of Multiplex PCR Assays for The Rapid Identification of Multiple Saccostrea Species, and Their Practical Applications in Restoration and Aquaculture. BMC Ecol. Evol. 2024, 24, 67. [Google Scholar] [CrossRef]
Pandit, R.; Travadi, T.; Sharma, S.; Joshi, C.; Joshi, M. DNA Meta-barcoding Using rbcL Based Mini-barcode Revealed Presence of Unspecified Plant Species in Ayurvedic Polyherbal Formulations. Phytochem. Anal. 2021, 32, 804–810. [Google Scholar] [CrossRef] [PubMed]
Shokralla, S.; Gibson, J.F.; Nikbakht, H.; Janzen, D.H.; Hallwachs, W.; Hajibabaei, M. Next-generation DNA barcoding: Using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens. Mol. Ecol. Resour. 2014, 14, 892–901. [Google Scholar] [CrossRef]
Antil, S.; Abraham, J.S.; Sripoorna, S.; Maurya, S.; Dagar, J.; Makhija, S.; Bhagat, P.; Gupta, R.; Sood, U.; Lal, R.; et al. DNA Barcoding, An Effective Tool for Species Identification: A Review. Mol. Biol. Rep. 2023, 50, 761–775. [Google Scholar] [CrossRef]
Jia, Y.; Zhao, S.; Guo, W.; Peng, L.; Zhao, F.; Wang, L.; Fan, G.; Zhu, Y.; Xu, D.; Liu, G.; et al. Sequencing Introduced False Positive Rare Taxa Lead to Biased Microbial Community Diversity, Assembly, and Interaction Interpretation in Amplicon Studies. Environ. Microbiome 2022, 17, 43. [Google Scholar] [CrossRef]
Urra, C.; Sanhueza, D.; Pavez, C.; Tapia, P.; Núñez-Lillo, G.; Minio, A.; Miossec, M.; Blanco-Herrera, F.; Gainza, F.; Castro, A.; et al. Identification of Grapevine Clones Via High-throughput Amplicon Sequencing: A Proof-of-concept Study. G3 Genes Genomes Genet. 2023, 13, jkad145. [Google Scholar] [CrossRef]
Hawliczek, A.; Bolibok, L.; Tofil, K.; Borzęcka, E.; Jankowicz-Cieślak, J.; Gawroński, P.; Kral, A.; Till, B.J.; Bolibok-Brągoszewska, H. Deep sampling and pooled amplicon sequencing reveals hidden genic variation in heterogeneous rye accessions. BMC Genom. 2020, 21, 845. [Google Scholar] [CrossRef]
Samarina, L.; Fedorina, J.; Kuzmina, D.; Malyukova, L.; Manakhova, K.; Kovalenko, T.; Matskiv, A.; Xia, E.; Tong, W.; Zhang, Z.; et al. Analysis of Functional Single-Nucleotide Polymorphisms (SNPs) and Leaf Quality in Tea Collection under Nitrogen-Deficient Conditions. Int. J. Mol. Sci. 2023, 24, 14538. [Google Scholar] [CrossRef]
Whitford, W.; Hawkins, V.; Moodley, K.; Grant, M.J.; Lehnert, K.; Snell, R.G.; Jacobsen, J.C. Optimised multiplex amplicon sequencing for mutation identification using the MinION nanopore sequencer. bioRxiv 2021. [Google Scholar] [CrossRef]
Whitford, W.; Hawkins, V.; Moodley, K.S.; Grant, M.J.; Lehnert, K.; Snell, R.G.; Jacobsen, J.C. Proof of concept for multiplex amplicon sequencing for mutation identification using the MinION nanopore sequencer. Sci. Rep. 2022, 12, 8572. [Google Scholar] [CrossRef] [PubMed]
Bertelli, A.; Steindl, S.; Kirchner, S.; Schwahofer, P.; Haring, E.; Szucsich, N.; Kruckenhauser, L.; Kapun, M. AmpliPiper: A versatile amplicon-seq analysis tool for multilocus DNA barcoding. bioRxiv 2024. [Google Scholar] [CrossRef]
Uitdewilligen, J.G.; Wolters, A.M.; D’hoop, B.B.; Borm, T.J.; Visser, R.G.; van Eck, H.J. Correction: A next-generation sequencing method for genotyping-by-sequencing of highly heterozygous autotetraploid potato. PLoS ONE 2015, 10, e0141940. [Google Scholar] [CrossRef] [PubMed]
Niimi, E.; Fujii, H.; Ohta, S.; Iwakura, T.; Endo, T.; Shimada, T. Development of Acid Citrus Cultivar Identification System By SNP Markers. Hortic. Res. 2022, 21, 111–122. [Google Scholar] [CrossRef]
Park, J.Y.; Jang, Y.J.; Jung, J.K.; Shim, E.J.; Sim, S.C.; Chung, S.M.; Lee, G.P. Development of SNP Markers for The Identification of Commercial Korean Watermelon Cultivars Using Fluidigm Genotyping Analysis. Korean J. Hortic. Sci. Technol. 2022, 40, 75–84. [Google Scholar] [CrossRef]
Meng, J.; Zhang, Y.; Wei, Y.; Li, R.; Li, Z.; Zhong, C. Identification of Commercial Cultivars in the Tabebuia Alliance Using Genotyping-by-Sequencing. Forests 2023, 14, 271. [Google Scholar] [CrossRef]
Scariolo, F.; Palumbo, F.; Vannozzi, A.; Sacilotto, G.B.; Gazzola, M.; Barcaccia, G. Genotyping Analysis by RAD-Seq Reads Is Useful to Assess the Genetic Identity and Relationships of Breeding Lines in Lavender Species Aimed at Managing Plant Variety Protection. Genes 2021, 12, 1656. [Google Scholar] [CrossRef]
Yu, Q.; Ling, Y.; Xiong, Y.; Zhao, W.; Xiong, Y.; Dong, Z.; Yang, J.; Zhao, J.; Zhang, X.; Ma, X. RAD-seq as an effective strategy for heterogenous variety identification in plants–a case study in Italian ryegrass (Lolium multiflorum). BMC Plant Biol. 2022, 22, 231. [Google Scholar] [CrossRef]
Shen, Y.; Yao, G.; Li, Y.; Tian, X.; Li, S.; Wang, N.; Zhang, C.; Wang, F.; Ma, Y. RAD-seq data reveals robust phylogeny and morphological evolutionary history of Rhododendron. Hortic. Plant J. 2024, 10, 866–878. [Google Scholar] [CrossRef]
Li, H.; Ikram, M.; Xia, Y.; Li, R.; Yuan, Q.; Zhao, W.; Siddique, K.H.M.; Guo, P. Genome-wide identification and development of InDel markers in tobacco (Nicotiana tabacum L.) using RAD-seq. Physiol. Mol. Biol. Plants 2022, 28, 1077–1089. [Google Scholar] [CrossRef] [PubMed]
Casanova, A.; Maroso, F.; Blanco, A.; Hermida, M.; Ríos, N.; García, G.; Manuzzi, A.; Zane, L.; Verissimo, A.; García-Marín, J.L.; et al. Low impact of different SNP panels from two building-loci pipelines on RAD-Seq population genomic metrics: Case study on five diverse aquatic species. BMC Genom. 2021, 22, 150. [Google Scholar] [CrossRef] [PubMed]
Liu, X.; Guo, P.; Xu, Q.; Du, W. Cotton Seed Cultivar Identification Based on the Fusion of Spectral and Textural Features. PLoS ONE 2024, 19, e0303219. [Google Scholar] [CrossRef] [PubMed]
Blazakis, K.N.; Stupichev, D.; Kosma, M.; El Chami, M.A.H.; Apodiakou, A.; Kostelenos, G.; Kalaitzis, P. Discrimination of 14 olive cultivars using morphological analysis and machine learning algorithms. Front. Plant Sci. 2024, 15, 1441737. [Google Scholar] [CrossRef]
Wahyuni, D.; Dwiyanti, F.G.; Pratama, R.; Majiidu, M.; Rachmat, H.H.; Siregar, I.Z. Chloroplast Genome Draft of Dryobalanops aromatica Generated Using Oxford Nanopore Technology and Its Potential Application for Phylogenetic Study. Forests 2021, 12, 1515. [Google Scholar] [CrossRef]
Aury, J.M.; Engelen, S.; Istace, B.; Monat, C.; Lasserre-Zuber, P.; Belser, C.; Cruaud, C.; Rimbert, H.; Leroy, P.; Arribatet, S.; et al. Long-read and chromosome-scale assembly of the hexaploid wheat genome achieves high resolution for research and breeding. GigaScience 2022, 11, giac034. [Google Scholar] [CrossRef]
Song, B.; Ning, W.; Wei, D.; Jiang, M.; Zhu, K.; Wang, X.; Edwards, D.; Odeny, D.A.; Cheng, S. Plant genome resequencing and population genomics: Current status and future prospects. Mol. Plant 2023, 16, 1252–1268. [Google Scholar] [CrossRef]
Wee, Y.; Bhyan, S.B.; Liu, Y.; Lu, J.; Li, X.; Zhao, M. The bioinformatics tools for the genome assembly and analysis based on third-generation sequencing. Brief. Funct. Genomics 2019, 18, 1–12. [Google Scholar] [CrossRef]
Duan, Y.; Wang, Y.; Ding, W.; Wang, C.; Meng, L.; Meng, J.; Chen, N.; Liu, Y.; Xing, S. Comparative and phylogenetic analysis of the chloroplast genomes of four commonly used medicinal cultivars of Chrysanthemums morifolium. BMC Plant Biol. 2024, 24, 992. [Google Scholar] [CrossRef]
Zhang, J.; Liu, H.; Xu, W.; Wan, X.; Zhu, K. Comparative analysis of chloroplast genome of Lonicera japonica cv. Damaohua. Open Life Sci. 2024, 19, 20220984. [Google Scholar] [CrossRef]
Lee, D.J.; Kim, C.K.; Lee, T.H.; Lee, S.J.; Moon, D.G.; Kwon, Y.H.; Cho, J.Y. The complete chloroplast genome sequence of economical standard tea plant, Camellia sinensis L. cultivar Sangmok, in Korea. Mitochondrial DNA B Resour. 2020, 5, 2835–2836. [Google Scholar] [CrossRef]
Vieira, M.B.; Faustino, M.V.; Lourenço, T.F.; Oliveira, M.M. DNA-Based Tools to Certify Authenticity of Rice Varieties—An Overview. Foods 2022, 11, 258. [Google Scholar] [CrossRef]
Wang, X.; Gussarova, G.; Ruhsam, M.; de Vere, N.; Metherell, C.; Hollingsworth, P.M.; Twyford, A.D. DNA barcoding British Euphrasia reveals deeply divergent polyploids but lack of species-level resolution. AoB PLANTS 2018, 10, ply026. [Google Scholar] [CrossRef]
Steinke, D.; Vences, M.; Salzburger, W.; Meyer, A. TaxI: A software tool for DNA barcoding using distance methods. Philos. Trans. R. Soc. B 2005, 360, 1975–1980. [Google Scholar] [CrossRef] [PubMed]
Hakimzadeh, A.; Asbun, A.A.; Albanese, D.; Bernard, M.; Buchner, D.; Callahan, B.; Caporaso, J.G.; Curd, E.; Djemiel, C.; Durling, M.B.; et al. A pile of pipelines: An overview of the bioinformatics software for metabarcoding data analyses. Mol. Ecol. Resour. 2024, 24, e13847. [Google Scholar] [CrossRef] [PubMed]
Zinger, L.; Lionnet, C.; Benoiston, A.-S.; Donald, J.; Mercier, C.; Boyer, F. metabaR: An R package for the evaluation and improvement of DNA metabarcoding data quality. Methods Ecol. Evol. 2021, 12, 586–592. [Google Scholar] [CrossRef]
Stucky, B.J. SeqTrace: A graphical tool for rapidly processing DNA sequencing chromatograms. J. Biomol. Tech. 2012, 23, 90–93. [Google Scholar] [CrossRef]
Oliveira, R.R.M.; Nunes, G.L.; de Lima, T.G.L.; Oliveira, G.; Alves, R. PIPEBAR and OverlapPER: Tools for a fast and accurate DNA barcoding analysis and paired-end assembly. BMC Bioinform. 2018, 19, 297. [Google Scholar] [CrossRef]
Williams, J.; Nash, B.; Ghiban, C.; Khalfan, M.; Hilgert, U.; Lauter, S.; Yang, C.; Micklos, D.A. Analysis of DNA Barcodes Using DNA Subway. In DNA Barcoding. Methods in Molecular Biology; DeSalle, R., Ed.; Humana Press: New York, NY, USA, 2024; Volume 2744, pp. 551–560. [Google Scholar] [CrossRef]
Han, R.; Qi, J.; Xue, Y.; Sun, X.; Zhang, F.; Gao, X.; Li, G. HycDemux: A hybrid unsupervised approach for accurate barcoded sample demultiplexing in nanopore sequencing. Genome Biol. 2023, 24, 222. [Google Scholar] [CrossRef]
Papetti, D.M.; Spolaor, S.; Nazari, I.; Tirelli, A.; Leonardi, T.; Caprioli, C.; Besozzi, D.; Vlachou, T.; Pelicci, P.G.; Cazzaniga, P.; et al. Barcode Demultiplexing of Nanopore Sequencing Raw Signals by Unsupervised Machine Learning. Front. Bioinform. 2023, 3, 1067113. [Google Scholar] [CrossRef]
Sintsova, A.; Ruscheweyh, H.J.; Field, C.M.; Feer, L.; Nguyen, B.D.; Daniel, B.; Hardt, W.D.; Vorholt, J.A.; Sunagawa, S. mBARq: A versatile and user-friendly framework for the analysis of DNA barcodes from transposon insertion libraries, knockout mutants, and isogenic strain populations. Bioinformatics 2024, 40, btae078. [Google Scholar] [CrossRef]
Sahlin, K.; Lim, M.C.W.; Prost, S. NGSpeciesID: DNA barcode and amplicon consensus generation from long-read sequencing data. Ecol. Evol. 2021, 11, 1392–1398. [Google Scholar] [CrossRef] [PubMed]
Rafeie, M.; Vafaee, F.; Faridani, O.R. Medlib: A feature-rich C/C++ library for exact alignment of nanopore sequences using multiple edit distance. bioRxiv 2025. [Google Scholar] [CrossRef]
Vasiljevic, N.; Lim, M.; Humble, E.; Seah, A.; Kratzer, A.; Morf, N.V.; Prost, S.; Ogden, R. Developmental Validation of Oxford Nanopore Technology MinION Sequence Data and the NGSpeciesID Bioinformatic Pipeline for Forensic Genetic Species Identification. Forensic Sci. Int. Genet. 2021, 53, 102493. [Google Scholar] [CrossRef] [PubMed]
Porter, T.M.; Hajibabaei, M. Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets. BMC Bioinform. 2021, 22, 256. [Google Scholar] [CrossRef]
Narayanan, R.; DeGroat, W.; Peker, E.; Zeeshan, S.; Ahmed, Z. VAREANT: A Bioinformatics Application for Gene Variant Reduction and Annotation. Bioinform. Adv. 2025, 5, vbae210. [Google Scholar] [CrossRef]
Sarhan, M.S.; Filosi, M.; Maixner, F.; Fuchsberger, C. Taxonize-gb: A tool for filtering GenBank non-redundant databases based on taxonomy. bioRxiv 2024. [Google Scholar] [CrossRef]
Seoane, P.; Diaz-Martinez, L.; Viguera, E.; Claros, M.G.; Grande-Perez, A. QuasiFlow: A bioinformatic tool for genetic variability analysis from next generation sequencing data. bioRxiv 2022. [Google Scholar] [CrossRef]
Fearnley, L.G.; Bennett, M.F.; Bahlo, M. Ultrafast, alignment-free detection of repeat expansions in next-generation DNA and RNA sequencing data. bioRxiv 2021. [Google Scholar] [CrossRef]
Koh, E.; Sunil, R.S.; Lam, H.Y.I.; Mutwil, M. Confronting the data deluge: How artificial intelligence can be used in the study of plant stress. Comput. Struct. Biotechnol. J. 2024, 23, 3454–3466. [Google Scholar] [CrossRef]
Salvi, D.; Berrilli, E.; D’Alessandro, P.; Biondi, M. Sharpening the DNA barcoding tool through a posteriori taxonomic validation: The case of Longitarsus flea beetles (Coleoptera: Chrysomelidae). PLoS ONE 2020, 15, e0233573. [Google Scholar] [CrossRef]
Yu, J.; Wu, X.; Liu, C.; Newmaster, S.; Ragupathy, S.; Kress, W.J. Progress in the use of DNA barcodes in the identification and classification of medicinal plants. Ecotoxicol. Environ. Saf. 2021, 208, 111691. [Google Scholar] [CrossRef]
Ratnasingham, S.; Wei, C.; Chan, D.; Agda, J.; Ballesteros-Mejia, L.; Ait Boutou, H.; Boutou, H.A.; El Bastami, Z.M.; Ma, E.; Manjunathet, R.; et al. BOLD v4: A Centralized Bioinformatics Platform for DNA-Based Biodiversity Data. In DNA Barcoding: Methods and Protocols; Springer: New York, NY, USA, 2024; pp. 403–441. [Google Scholar]
Jin, L.; Shi, H.Y.; Li, T.; Zhao, N.; Xu, Y.; Xiao, T.W.; Song, F.; Ma, C.X.; Li, Q.M.; Lin, L.X.; et al. A DNA barcode library for woody plants in tropical and subtropical China. Sci. Data 2023, 10, 819. [Google Scholar] [CrossRef]
Song, F.; Deng, Y.F.; Yan, H.F.; Lin, Z.L.; Delgado, A.; Trinidad, H.; Gonzales-Arce, P.; Riva, S.; Cano-Echevarría, A.; Ramos, E.; et al. Flora diversity survey and establishment of a plant DNA barcode database of Lomas ecosystems in Peru. Sci. Data 2023, 10, 294. [Google Scholar] [CrossRef]
Grant, D.M.; Brodnicke, O.B.; Evankow, A.M.; Ferreira, A.O.; Fontes, J.T.; Hansen, A.K.; Jensen, M.R.; Kalaycı, T.E.; Leeper, A.; Patil, S.K.; et al. The Future of DNA Barcoding: Reflections from Early Career Researchers. Diversity 2021, 13, 313. [Google Scholar] [CrossRef]
Bartolo, A.G.; Zammit, G.; Peters, A.F.; Küpper, F.C. The current state of DNA barcoding of macroalgae in the Mediterranean Sea: Presently lacking but urgently required. Bot. Mar. 2020, 63, 253–272. [Google Scholar] [CrossRef]

Figure 1. Key stages of DNA barcoding of cultivars in local germplasm collections: selection of loci and primers; amplicon sequencing; sequence alignment; data analysis to reveal genotype-specific (unique) SNPs and InDels; verification of their specificity across a broad range of intraspecific diversity; development of the genetic passport of the cultivar; and creation of a DNA library/database.
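The data-analysis stage named in Figure 1 (revealing genotype-specific SNPs and InDels from aligned amplicons) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `unique_variants`, the cultivar labels, and the toy sequences are hypothetical, and the input is assumed to be an existing multiple alignment with `-` marking gaps.

```python
def unique_variants(alignment):
    """Report alignment columns where one accession differs from all others.

    alignment: dict mapping accession name -> aligned sequence
               (equal lengths, '-' for gaps). Names are illustrative only.
    Returns: dict mapping accession name -> list of (position, state, kind),
             with 1-based positions and kind equal to "SNP" or "InDel".
    """
    names = list(alignment)
    if len(names) < 2:
        raise ValueError("at least two aligned sequences are required")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to equal length")

    result = {name: [] for name in names}
    for pos in range(lengths.pop()):
        column = [alignment[name][pos].upper() for name in names]
        for i, name in enumerate(names):
            state = column[i]
            others = column[:i] + column[i + 1:]
            if state in others:
                continue  # shared with at least one other accession
            # A private gap, or a private base opposite gaps, is an InDel.
            is_indel = state == "-" or all(o == "-" for o in others)
            result[name].append((pos + 1, state, "InDel" if is_indel else "SNP"))
    return result


if __name__ == "__main__":
    # Toy alignment (assumed data, not taken from the review).
    toy = {
        "cv_A": "ATGCATGCA",
        "cv_B": "ATGCATGCA",
        "cv_C": "ATGTATGCA",
        "cv_D": "ATGCA-GCA",
    }
    for name, variants in unique_variants(toy).items():
        print(name, variants)
```

In this toy case only cv_C and cv_D carry private variants, so they could be told apart at this locus, whereas cv_A and cv_B would need an additional locus. Ambiguous bases such as N are treated like any other character here, which a real analysis would likely want to filter.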

Figure 2. NCBI Primer-BLAST results for universal chloroplast primers and ITS primers. Top bar plot: the number of accessions in the NCBI database. Bottom bar plot: the length of the amplified product (min and max) for each primer pair.

Figure 3. Applications of plant DNA barcoding and the most frequently used DNA barcode loci for genotype-level identification of valuable crops. The left scheme presents the levels of plant identification using DNA barcoding and the applications of the results. The right scheme presents the most efficiently used loci and their combinations for intraspecific, genotype-level identification.

Table 1. Universal primers for DNA-barcoding of plant species.

Primer Code	Forward/Reverse	Sequence 5′-3′	Reference
rbcLa-F	F	ATGTCACCACAAACAGAGACTAAAGC	[20]
rbcLr590	R	AGTCCACCGCGTAGACATTCAT	[68]
rbcLa-rev	R	GTAAAATCAAGTCCACCRCG	[69]
rbcLajf634R	R	GAAACGGTCTCTCCAACGCAT	[70]
rbcL724R	R	TCGCATGTACCTGCAGTAGC	[71]
matK2.1a	F	ATCCATCTGGAAATCTTAGTTC	[72]
matK2.1F	F	CCTATCCATCTGGAAATCTTAG	[72]
matK_1R_kim	F	ACCCAGTCCATCTGGAAATCTTGGTCC	[73]
MatK_390f	F	CGATCTATTCATTCAATATTTC	[74]
MatK_Xf	F	TAATTTACGATCAATTCATTC	[72]
MatK-3FKIM-r	R	CGTACAGTACTTTTGTGTTTACGAG	[73]
MatK_1326r	R	TCTAGCACACGAAAGTCGAAGT	[74]
MatK_5r	R	GTTCTAGCACAAGAAAGTCG	[72]
matK3.2	R	CTTCCTCTGTAAAGAATTC	[72]
23S,4.5S/5S	F	TCTCCTCCGACTTCCCTAG	[22]
23S,4.5S/5S	R	ACCATGAACGAGGAAAGGC	[22]
16S	F	ATTGCGTCGTTGTGCCTGG	[22]
16S	R	GATACGTTGTTAGGTGCTCC	[22]
petB/petD	F	TAGGGGGAATTACACTTAC	[22]
petB/petD	R	CATTAACATGAATACGGCAG	[22]
rpl23/rpl2.l	F	GAAGAAGCTTGTACAGTTTGG	[22]
rpl23/rpl2.l	R	TTTACTTACGGCGACGAAG	[22]
rpl2 intron	F	ATTGAGTTCAGTAGTTCCTC	[22]
rpl2 intron	R	CCAAACTGTACAAGCTTCTTC	[22]
rpoC1 intron	F	GAGTAACATGAAGCTCAG	[22]
rpoC1 intron	R	GTTTCCTTTCATCCGGCT	[22]
trnK intron	F	GTCTACATCATCGGTAGAG	[22]
trnK intron	R	CAACCCAATCGCTCTTTTG	[22]
trnE-UUC/trnT-GUU	F	TCCTGAACCACTAGACGATG	[75]
trnE-UUC/trnT-GUU	R	ATGGCGTTACTCTACCACTG	[75]
ITS-p5	F	CCTTATCAYTTAGAGGAAGGAG	[67]
ITS-p3	F	YGACTCTCGGCAACGGATA	[67]
ITS-u4	R	RGTTTCTTTTCCTCCGCTTA	[67]
ITS-u2	R	GCGTTCAAAGAYTCGATGRTTC	[67]
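
Several primers in Table 1 contain IUPAC degenerate bases (R = A/G, Y = C/T), so each such entry stands for a small pool of oligonucleotides. The following Python sketch, offered as an illustration rather than as part of the reviewed protocols, expands a degenerate primer into its explicit variants and computes a reverse complement, which is useful when locating a reverse primer on the plus strand of a reference sequence. The function names `expand_primer` and `reverse_complement` are hypothetical; only the primer sequences are taken from Table 1.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also handles ambiguity codes (R <-> Y, K <-> M, ...).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def expand_primer(seq):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq.upper()))]


def reverse_complement(seq):
    """Reverse complement of a primer, keeping IUPAC codes consistent."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Primer sequences copied from Table 1.
    primers = {
        "rbcLa-rev": "GTAAAATCAAGTCCACCRCG",
        "ITS-p5": "CCTTATCAYTTAGAGGAAGGAG",
        "ITS-u2": "GCGTTCAAAGAYTCGATGRTTC",
    }
    for code, seq in primers.items():
        variants = expand_primer(seq)
        print(code, len(variants), "variant(s); reverse complement:",
              reverse_complement(seq))
```

The number of variants grows multiplicatively with each degenerate position, so the sketch is suited to primers with a few ambiguity codes, as in Table 1, rather than to highly degenerate designs.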


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

