Article

Genome-Wide InDel Marker Development and Genetic Diversity Analysis of 52 Tomato Germplasm Accessions

1
Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Plant Systems Biology, Haixia Institute of Science and Technology, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2
Fujian Key Laboratory of Vegetable Genetics and Breeding, Crops Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou 350002, China
3
Institute of Horticulture, Ningxia Academy of Agriculture and Forestry Sciences, Yinchuan 750002, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Plants 2026, 15(7), 1118; https://doi.org/10.3390/plants15071118
Submission received: 10 March 2026 / Revised: 3 April 2026 / Accepted: 3 April 2026 / Published: 6 April 2026

Abstract

To address the narrow genetic background and low phenotypic selection efficiency that constrain tomato breeding, comparative genomics was applied. A comparison of the genomic sequences of five tomato varieties (‘Micro-Tom’, ‘Moneymaker’, ‘M82’, ‘Heinz 1706’, and ‘LA2093’) identified a total of 285,796 candidate InDel loci, from which 255 pairs of molecular marker primers were developed. After screening for InDel length, polymorphism, and electrophoretic performance, 63 InDel markers with stable amplification, clear polymorphic bands, and coverage of all 12 chromosomes were selected. These markers were then used to analyze the genetic diversity of 52 tomato germplasm accessions. The polymorphism information content (PIC) values of the markers ranged from 0.074 to 0.402, with an average of 0.2804. Cluster analysis based on the InDel genotyping data divided the 52 accessions into four distinct groups with clear genetic differentiation, a grouping that was supported by previously collected phenotypic data for the same accessions. Furthermore, a core set of 24 InDel primer pairs was selected to construct unique DNA fingerprint profiles for the germplasm. Overall, the InDel markers developed in this study provide an efficient tool for evaluating genetic diversity in tomato germplasm and offer a reliable molecular basis for germplasm identification, heterosis prediction, and marker-assisted breeding, thereby facilitating the development of improved tomato cultivars.

1. Introduction

Tomato (Solanum lycopersicum) is a nutritious, brightly colored vegetable crop with a distinctive aroma. Its high content of lycopene, vitamin C, beta-carotene, and other bioactive compounds, together with its sweet and sour flavor, makes it an indispensable part of the daily diet and gives it considerable economic and nutritional value [1]. Tomatoes originated in the Andes Mountains of South America. After the tomato was introduced to Europe in the 16th century, hundreds of years of domestication and improvement significantly enhanced its morphological characteristics and agronomic traits, gradually giving rise to the modern cultivated tomato [2]. During this long-term domestication and breeding, continuous artificial selection and genetic bottleneck effects have significantly reduced the genetic diversity of cultivated varieties, posing a major challenge to the genetic improvement of important agronomic traits [3].
In recent years, with the rapid development of molecular breeding technology, molecular marker-assisted selection of genes for superior agronomic traits has gradually become an important strategy of tomato breeding [4]. Molecular breeding mainly involves developing molecular markers that are closely linked to target genes, followed by the use of these markers to screen for desired traits during the breeding process [5,6,7]. Molecular markers are genetic markers based on nucleotide sequence variation among individuals. They reflect genetic polymorphism at the DNA level and are widely used in germplasm identification, marker-assisted breeding, and genetic improvement. Compared to traditional morphological, cytological, and protein markers, molecular markers have several advantages. First, they are not affected or limited by environmental conditions or gene expression, making them more reliable and stable in genetic studies. Second, a large number of molecular markers are available across the genome, providing more comprehensive and detailed data for genetic research [8,9]. Based on the methods to detect DNA polymorphisms, molecular markers can generally be divided into four categories: (1) RFLP (Restriction Fragment Length Polymorphism) markers based on Southern hybridization detection; (2) PCR-based molecular markers, such as RAPD (Random Amplified Polymorphic DNA), SSRs (Simple Sequence Repeats), ISSRs (Inter-simple Sequence Repeats), and SRAPs (Sequence-related Amplified Polymorphisms) [10]; (3) markers based on a combination of PCR and restriction enzyme digestion, such as AFLPs (Amplified Fragment Length Polymorphisms) and CAPSs (Cleaved Amplified Polymorphic Sequences) [11]; and (4) newly developed markers based on sequence variation, such as InDels (Insertion-Deletions) and SNPs (Single Nucleotide Polymorphisms) [12,13]. 
These marker systems play an important role in genetic polymorphism studies and provide strong technical support for germplasm identification, studies of genetic evolution, and molecular breeding [14,15]. Among the many types of molecular markers, InDel markers are widely used because of their simple structure, high polymorphism rate, co-dominant inheritance, and close linkage with target traits. They have been successfully applied in tomato genetic improvement studies covering key breeding objectives such as disease resistance, quality enhancement, and stress tolerance. Because InDels are widely distributed in the genome, easy to detect, and highly reproducible, they are well suited to high-density marker development and are frequently applied in gene mapping, association studies, and genetic linkage map construction.
With the completion of the tomato reference genome sequencing and assembly [16,17,18], many important traits in tomato have been finely mapped. To date, more than 100 genes in tomatoes have been cloned or precisely mapped. Although considerable progress has been made in identifying functional genes controlling important tomato traits, there is still a lack of rapid, efficient, and low-cost molecular markers suitable for breeding applications [19,20]. Therefore, the development of tomato DNA molecular markers is expected to provide precise and efficient tools for marker-assisted breeding, molecular genetic mapping, and the study of functional genes involved in abiotic stress responses, thereby promoting rapid progress in related research fields [21,22].
In this study, genomic sequences from five tomato varieties—‘Micro-Tom’, ‘Moneymaker’, ‘M82’, ‘Heinz 1706’, and ‘LA2093’—were used to identify genome-wide InDel loci and develop a set of easily detectable molecular markers. A total of 63 InDel markers distributed across the 12 tomato chromosomes were selected and subsequently used to analyze the genetic diversity of 52 tomato germplasm accessions. In addition, DNA fingerprinting profiles were constructed based on these markers. The results provide useful molecular tools for germplasm identification, genetic diversity evaluation, and marker-assisted breeding, thereby contributing to the efficient utilization and genetic improvement of tomato germplasm resources.

2. Results

2.1. Identification and Statistical Analysis of Genome-Wide InDel Molecular Markers

To develop InDel molecular markers across the entire tomato genome, conserved sequence fragments containing InDel polymorphisms were identified based on genomic data from five tomato varieties: ‘Micro-Tom’, ‘Moneymaker’, ‘M82’, ‘Heinz 1706’, and ‘LA2093’. A total of 285,796 InDel loci were detected, including 121,933 insertions (42.66%) and 163,863 deletions (57.34%). To illustrate the distribution of InDel loci across the chromosomes, genes, InDel loci, and selected molecular markers were counted in 50 kb intervals. As shown in Figure 1, the outermost circle indicates the chromosomal start positions, while genes (blue), InDel loci (green), and selected molecular markers (pink) are displayed sequentially from the outer to the inner tracks. The number of InDel loci per chromosome ranged from 19,553 (Chr11) to 27,078 (Chr09), accounting for 6.84% and 9.47% of the total InDels, respectively (Table 1). After correcting for chromosome length, there was no significant difference in InDel density among the chromosomes. However, as illustrated in Figure 1, the distribution of InDel sites along individual chromosomes was uneven. Regions with high gene densities contained fewer InDels, whereas InDels were preferentially enriched in gene-sparse regions. Furthermore, InDel density near centromeric regions was significantly reduced, consistent with the reduced detection efficiency in regions rich in repetitive sequences. The final set of 63 InDel markers (pink track) was distributed across the genome, providing comprehensive coverage for subsequent genetic analyses.
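The 50 kb interval counting used for the density tracks can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `count_per_window` and the example coordinates are hypothetical, and positions are assumed to be 1-based.

```python
from collections import Counter

WINDOW = 50_000  # 50 kb interval, as stated in the text


def count_per_window(positions, window=WINDOW):
    """Count loci per fixed-size window.

    positions: iterable of (chromosome, 1-based position) tuples.
    Returns a Counter keyed by (chromosome, window_index), where
    window_index 0 covers positions 1..window.
    """
    counts = Counter()
    for chrom, pos in positions:
        counts[(chrom, (pos - 1) // window)] += 1
    return counts


# Hypothetical coordinates, for illustration only.
example = [("Chr01", 1), ("Chr01", 50_000), ("Chr01", 50_001), ("Chr09", 120_500)]
window_counts = count_per_window(example)
```

The same function can be applied separately to gene start positions, InDel loci, and selected markers to obtain one count series per track.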

2.2. Development and Screening of InDel Markers

To establish a high-quality set of InDel markers, loci with InDel lengths of ≥6 bp and polymorphisms between at least two tomato accessions were selected for marker development. A total of 255 candidate InDel loci distributed across the 12 tomato chromosomes were identified for further development of PCR-based markers (Table S1). Target sequences of 70–150 bp flanking the corresponding InDel loci were extracted as templates for primer design. PCR amplification was subsequently performed using DNA from 52 tomato germplasm accessions to evaluate the amplification efficiency and polymorphism of the designed primers. Based on the clarity, reproducibility, and polymorphism of the amplified bands, loci showing no or low polymorphism were discarded. Ultimately, 63 InDel markers (24.7%) that could be stably amplified and displayed clear polymorphism were selected. The PCR amplification results of 10 representative InDel primer pairs across the 52 tomato accessions are presented in Figure S1, while Figure 2 shows the amplification profiles of four representative InDel markers. Detailed information for the 63 InDel primer pairs is provided in Table S2. These markers are distributed across the entire tomato genome, with an average of 5.25 markers per chromosome. Among the 12 chromosomes, chromosome 12 contains the fewest markers (2), whereas chromosome 9 contains the most markers (11) (Figure 3).

2.3. Polymorphism Analysis of InDel Markers

Polymorphism Information Content (PIC) is an important indicator for measuring the polymorphism of genetic markers. A higher PIC value indicates greater marker polymorphism and provides more genetic information. We analyzed 52 tomato accessions using 63 core InDel markers. The results showed that the number of alleles amplified by each InDel marker ranged from 2 to 3, and the PIC values ranged from 0.038 to 0.402 (Table 2), with an average PIC of 0.307. Among these markers, 48 markers exhibited relatively high polymorphism (PIC > 0.25). These results demonstrate that the InDel markers developed in this study show moderate to high levels of polymorphism and are suitable for genetic diversity analysis of tomato germplasm.
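For readers who want to reproduce a PIC value by hand, the sketch below assumes the commonly used Botstein-type formulation, PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. The article does not state the formula, so this choice and the function name `pic` are assumptions.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies.

    freqs: allele frequencies at one marker; they must sum to 1.
    Assumed formula: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# A biallelic marker with equal allele frequencies (illustrative input).
balanced_biallelic = pic([0.5, 0.5])
# A monomorphic marker carries no information.
monomorphic = pic([1.0])
```

Under this formulation a biallelic marker reaches its maximum PIC of 0.375 at equal allele frequencies, so a value above 0.375 would require a third allele.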

2.4. Application of InDel Markers in Tomato Germplasm Analysis

A Neighbor-Joining (NJ) phylogenetic tree was constructed based on the genetic distance matrix to evaluate the genetic relationships among the 52 tomato accessions. The accessions were classified into four major groups (G1–G4) (Figure 4). Group G1 contained 11 accessions that shared similar phenotypic characteristics, including determinate growth habit, yellow or light-yellow flowers, red fruits, round or oblate fruit shape, and comparable soluble solid contents, suggesting a relatively narrow genetic background. Group G2 consisted of three accessions, all of which were wild small-fruited tomatoes. Group G3 included 15 accessions that were predominantly characterized by an indeterminate growth habit, red fruits with round or oblate shapes, and medium to high soluble solid contents. Group G4 represented the largest and most genetically diverse group, comprising 23 accessions, most of which were cherry tomato types. Overall, the developed InDel markers effectively distinguished different tomato germplasm types and revealed clear phylogenetic relationships among the tested accessions.
To further investigate the population structure, STRUCTURE analysis was performed. The cross-validation error (CV_error) curve indicated that the optimal number of genetic clusters was K = 4 (Figure 5a). The corresponding population structure plot (Figure 5b) showed that the 52 tomato accessions could be clearly divided into four genetic subpopulations (A, B, C, and D). Accessions within each subpopulation exhibited relatively homogeneous genetic composition, while clear genetic differentiation was observed among subpopulations, indicating distinct genetic backgrounds among the tested materials.
Principal component analysis (PCA) further supported the population structure results. In the PCA plot (Figure 5c), the first two principal components (PC1 and PC2) clearly separated the 52 accessions into four groups, which was consistent with the STRUCTURE analysis (K = 4). This indicates that the genetic differences among the four groups can be effectively distinguished through principal component analysis, further verifying the reliability of the population structure division. The different groups show obvious spatial distribution differences in the PCA plot, reflecting the independence of genetic variation among the groups, and also demonstrating that the molecular markers used have high polymorphism and discriminative power, fully revealing the genetic diversity among these tomato accessions.

2.5. Construction of DNA Fingerprinting

The phenotypic survey of the 52 tomato accessions revealed considerable morphological variation. Among them, 31 accessions (60%) exhibited an indeterminate growth habit, characterized by continuous flowering and fruiting with a relatively long growing season, whereas 21 accessions (40%) were determinate with a more concentrated growth cycle. Leaf morphology was classified into four major types: compound narrow leaves, compound broad leaves, common leaves, and sweet potato leaves. Among these, compound narrow-leaf types accounted for the highest proportion (approximately 45%), including accessions such as T20, T49, and T29. Inflorescence types were mainly dichotomous (approximately 60%) and simple (approximately 35%). Fruit shapes were predominantly round or oblate. The number of locules ranged from 1.88 to 9. Approximately 60% of the accessions had 2–4 locules, whereas about 40% contained 6–9 locules (Table S3). These results indicate substantial phenotypic diversity among the tested tomato germplasm.
All 63 InDel markers developed in this study exhibited clear polymorphism across the 52 tomato accessions, enabling complete discrimination among the tested varieties. Therefore, these markers can be used for variety identification and DNA fingerprint construction in tomato germplasm.
From the 12 tomato chromosomes, one to three biallelic InDel markers with relatively high PIC values were selected per chromosome to construct the DNA fingerprinting system. A total of 24 markers were selected, including T1M0546, T1M6565, T2M0112, T2M3180, T3M5319, T4M0210, T4M6000, T5M0701, T5M1237, T5M2207, T6M3024, T6M3582, T7M0019, T7M5397, T8M2758, T8M5194, T9M0714, T9M0982, TaM0641, TaM6451, TbM0724, TbM1162, TbM3021, and TcM0679. The genotyping results of these 24 markers across the 52 tomato accessions were recorded based on banding patterns. The absence of a band was denoted as “0”, while the presence of bands was represented by codes “1” to “4”. Specifically, “1” indicated a band consistent with the Heinz 1706 reference genome; “2” indicated the presence of an InDel relative to the Heinz 1706 genome; “3” represented heterozygous bands; and “4” indicated an InDel different in length from the type represented by code “2”. Based on these scoring criteria, the genotypes of the 24 markers were converted into 24-digit genotype codes to construct DNA fingerprints for the 52 tomato accessions (Table 3).
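The coding scheme above can be illustrated with a short sketch that concatenates one code per marker, in a fixed marker order, into a 24-digit string. The marker names are those listed in the text; the function name `build_fingerprint` and the example genotype calls are hypothetical and do not correspond to any accession in Table 3.

```python
# Marker order as listed in the text (24 markers).
MARKERS = [
    "T1M0546", "T1M6565", "T2M0112", "T2M3180", "T3M5319", "T4M0210",
    "T4M6000", "T5M0701", "T5M1237", "T5M2207", "T6M3024", "T6M3582",
    "T7M0019", "T7M5397", "T8M2758", "T8M5194", "T9M0714", "T9M0982",
    "TaM0641", "TaM6451", "TbM0724", "TbM1162", "TbM3021", "TcM0679",
]

# 0 = no band, 1 = reference-type band, 2 = InDel relative to the reference,
# 3 = heterozygous, 4 = InDel of a different length than code 2.
VALID_CODES = {"0", "1", "2", "3", "4"}


def build_fingerprint(calls, marker_order=MARKERS):
    """Concatenate per-marker codes into one fingerprint string."""
    digits = []
    for marker in marker_order:
        code = str(calls[marker])
        if code not in VALID_CODES:
            raise ValueError(f"unexpected code {code!r} for {marker}")
        digits.append(code)
    return "".join(digits)


# Hypothetical accession: reference-type at every marker except two.
calls = {marker: 1 for marker in MARKERS}
calls["T3M5319"] = 2
calls["TcM0679"] = 3
fingerprint = build_fingerprint(calls)
```

Keeping the marker order fixed is what makes fingerprints comparable across accessions; two accessions are distinguishable whenever their strings differ in at least one position.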

3. Discussion

Molecular markers have become essential tools for crop genetic research and modern breeding programs. With the rapid development of high-throughput sequencing technologies and the continuous reduction in sequencing costs, multiple reference genomes of tomato and its related wild species have been released in recent years [23,24,25]. These genomic resources have enabled the identification of large numbers of sequence variations across the tomato genome, thereby facilitating the development of highly efficient molecular markers.
Compared with early molecular marker systems such as RFLP and RAPD, genome-based markers offer higher resolution and reliability for genetic analysis. Among the currently available markers, SNPs represent the most abundant type of genetic variation in plant genomes; however, their detection often requires specialized equipment and relatively high costs, which limits their routine application in breeding programs. In contrast, SSR and InDel markers can be easily detected through PCR amplification followed by gel electrophoresis. Nevertheless, SSR loci consist of tandem repeat sequences, and large variations in repeat numbers may affect primer binding efficiency and amplification stability. InDel markers, by contrast, are widely distributed across the genome and are typically flanked by conserved sequences, which allows for the design of primers with higher specificity and more stable amplification efficiency [26].
In the present study, genome sequence data from five tomato accessions were compared to identify candidate insertion–deletion loci, from which 63 core InDel markers with stable polymorphism, distributed across the 12 chromosomes of the tomato genome, were developed. The polymorphism and stability of these markers were validated using 52 tomato germplasm accessions. The results indicate that the developed InDel markers possess good amplification stability and relatively high polymorphism, making them suitable for genetic diversity analysis and germplasm identification. In addition, the wide application of InDel markers may facilitate more accurate analyses of population structure and pedigree relationships in tomato breeding materials, thereby contributing to studies on tomato genetic evolution and germplasm classification [27,28].
Germplasm resources represent the fundamental basis for crop genetic improvement. Evaluating the genetic diversity of tomato germplasm is essential for broadening the genetic base and identifying valuable breeding materials [29,30]. In this study, cluster analysis based on InDel markers grouped the 52 tomato accessions into four major clusters. The population structure analysis using STRUCTURE (V2.3.4) software produced results largely consistent with the clustering analysis, further supporting the reliability of the developed markers. Notably, three wild small-fruited tomato accessions (T1, T2, and T3) formed an independent subgroup, whereas the remaining accessions belonged to cultivated tomato groups. Furthermore, the clustering patterns were closely associated with several phenotypic traits, such as growth habit, fruit shape, and fruit color, indicating that the InDel markers developed in this study can effectively differentiate different types of tomato germplasm and reflect their phylogenetic relationships.
These results demonstrate that the tomato germplasm analyzed in this study contains substantial genetic variation and provides valuable genetic resources for future breeding programs. The developed marker system may also support various breeding applications, including QTL mapping of important agronomic traits, parental selection, and germplasm identification. By enabling breeders to select parental materials with complementary genetic backgrounds according to breeding objectives, such as high yield, improved fruit quality, or enhanced storage and transport tolerance, the use of these markers may accelerate the process of tomato variety improvement.
Overall, the InDel markers developed in this study exhibit good polymorphism and genome-wide distribution across all 12 tomato chromosomes. These markers are easy to use and provide reliable molecular tools for tomato genetic diversity analysis, germplasm identification and evaluation, functional gene mining, marker-assisted selection, and variety purity testing.

4. Materials and Methods

4.1. Test Materials

This study selected five tomato germplasm accessions—cultivated varieties S. lycopersicum ‘Micro-Tom’, ‘Moneymaker’, ‘M82’, and ‘Heinz 1706’, and wild tomato S. pimpinellifolium ‘LA2093’—for InDel identification. In addition, 52 tomato resources (Table S3) were used for genetic diversity analysis. This batch of tomato germplasm resources includes various types such as wild small-fruited tomatoes, cherry tomatoes, and highly inbred self-pollinated lines. Among them, wild small-fruited tomatoes retain the original genetic characteristics of tomatoes; cherry tomatoes combine both edible value and genetic diversity; and highly inbred self-pollinated lines have a stable genetic background. These 52 varieties were provided by the Ningxia Academy of Agriculture and Forestry Sciences in Yinchuan, Ningxia Hui Autonomous Region, and their agronomic traits were observed and recorded.

4.2. Acquisition of the Reference Genome

The tomato reference genomes for ‘Micro-Tom’, ‘Moneymaker’, ‘M82’, ‘Heinz 1706’, and ‘LA2093’ were obtained from the National Center for Biotechnology Information (NCBI) database or the tomato database (http://solomics.agis.org.cn/). Download links for ‘Micro-Tom’: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11897730/ (accessed on 15 January 2025); ‘Moneymaker’: http://solomics.agis.org.cn/tomato/datasource/2/ (accessed on 15 January 2025); ‘M82’: https://www.ncbi.nlm.nih.gov/assembly/GCA_900008105.1/ (accessed on 15 January 2025); ‘Heinz 1706’: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000188115.4/ (accessed on 15 January 2025); ‘LA2093’: https://www.ncbi.nlm.nih.gov/sra/?term=LA2093+Solanum+pimpinellifolium (accessed on 15 January 2025).

4.3. Indel Molecular Marker Screening and Primer Design

Based on Indel sites obtained from whole-genome alignment (a total of 285,796 sites), the following stepwise screening was conducted to obtain candidate sites suitable for subsequent marker development.

4.3.1. Basic Filtering

The raw InDel set was initially filtered using BCFtools (v1.15), retaining sites that met the following criteria: (1) quality value (QUAL) ≥ 30; (2) InDel length ≥ 6 bp; (3) location outside centromeres, telomeres, and known tandem repeat regions (based on RepeatMasker annotation); and (4) location in a non-repetitive, single-copy region, confirmed by extracting 200 bp of sequence on each side of the InDel and aligning it to the reference genome, which avoids amplification complexity caused by repetitive elements.
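The two numeric criteria (QUAL ≥ 30 and InDel length ≥ 6 bp) can be illustrated on VCF-style records with plain Python. This is a stand-in for illustration only and does not reproduce the BCFtools commands used in the study; the function names, the example records, and the decision to skip multi-allelic or quality-less records are assumptions. The repeat-region and single-copy checks are not covered.

```python
MIN_QUAL = 30.0   # minimum QUAL, from the text
MIN_LENGTH = 6    # minimum InDel length in bp, from the text


def indel_length(ref, alt):
    """Length difference between the REF and ALT alleles, in bp."""
    return abs(len(alt) - len(ref))


def passes_basic_filter(vcf_line):
    """Apply the QUAL and length criteria to one tab-separated VCF record.

    Assumes the standard column order CHROM, POS, ID, REF, ALT, QUAL, ...
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt, qual = fields[3], fields[4], fields[5]
    if qual == "." or "," in alt:
        # Assumption of this sketch: records without a quality value
        # or with several ALT alleles are skipped.
        return False
    return float(qual) >= MIN_QUAL and indel_length(ref, alt) >= MIN_LENGTH


# Hypothetical records, for illustration only.
records = [
    "Chr01\t1000\t.\tA\tATTTTTTT\t50\t.\t.",   # 7 bp insertion, QUAL 50
    "Chr01\t2000\t.\tACG\tA\t99\t.\t.",        # 2 bp deletion, too short
    "Chr02\t3000\t.\tAGGGGGGG\tA\t12\t.\t.",   # 7 bp deletion, low QUAL
]
kept = [line for line in records if passes_basic_filter(line)]
```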

4.3.2. Genome Distribution Screening

To obtain a genome-wide evenly distributed marker set, Indels passing the basic filtering were sorted according to their physical chromosomal positions, and neighboring sites less than 100 kb apart were removed to ensure uniform marker distribution along the chromosomes. Additionally, based on genome annotation information, Indels were categorized into genic and intergenic regions, and sites were retained in proportion to balance the functional distribution of markers.
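The spacing rule can be sketched as a greedy pass over position-sorted sites. The text does not say which of two neighboring sites is kept, so keeping the first site encountered on each chromosome is an assumption, as are the function name `thin_sites` and the example coordinates. The proportional retention of genic and intergenic sites is not covered.

```python
MIN_SPACING = 100_000  # 100 kb, from the text


def thin_sites(sites, min_spacing=MIN_SPACING):
    """Keep sites that lie at least min_spacing bp after the last kept site.

    sites: iterable of (chromosome, position) tuples.
    Returns the kept sites sorted by chromosome and position.
    """
    kept = []
    last_kept = {}  # chromosome -> position of the most recently kept site
    for chrom, pos in sorted(sites):
        if chrom not in last_kept or pos - last_kept[chrom] >= min_spacing:
            kept.append((chrom, pos))
            last_kept[chrom] = pos
    return kept


# Hypothetical coordinates, for illustration only.
candidates = [
    ("Chr01", 10_000), ("Chr01", 60_000), ("Chr01", 110_000),
    ("Chr01", 250_000), ("Chr02", 5_000),
]
thinned = thin_sites(candidates)
```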

4.3.3. Primer Design and Amplifiability Evaluation

For the retained Indel sites, 250 bp flanking sequences upstream and downstream were extracted, and InDel marker primers were designed using Primer 6.0 software. Primer design parameters were set as follows: primer length of 21–30 bp, Tm of 57–62 °C, GC content of 40–70%, product size of 70–150 bp, and primers should avoid forming dimers or hairpins with themselves or each other. Sites for which qualified primers could not be designed were excluded.
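Three of the stated design constraints (primer length, GC content, and product size) are simple to check programmatically, as in the sketch below. It does not replace the primer design software: Tm and dimer or hairpin formation are not evaluated, and the function names and example sequences are hypothetical.

```python
PRIMER_LENGTH_RANGE = (21, 30)   # bp, from the text
GC_RANGE = (40.0, 70.0)          # percent, from the text
PRODUCT_SIZE_RANGE = (70, 150)   # bp, from the text


def gc_percent(seq):
    """GC content of a primer sequence, in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def primer_ok(seq):
    """Check primer length and GC content against the stated ranges."""
    length_ok = PRIMER_LENGTH_RANGE[0] <= len(seq) <= PRIMER_LENGTH_RANGE[1]
    return length_ok and GC_RANGE[0] <= gc_percent(seq) <= GC_RANGE[1]


def product_ok(size):
    """Check the expected amplicon size against the stated range."""
    return PRODUCT_SIZE_RANGE[0] <= size <= PRODUCT_SIZE_RANGE[1]


# Hypothetical sequences, for illustration only.
good_primer = "ATGCATGCATGCATGCATGCATG"   # 23 nt, 11 G/C bases
at_rich_primer = "ATATATATATATATATATATAT"  # 22 nt, no G/C bases
```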

4.3.4. Population Polymorphism and Experimental Validation

Four tomato materials with high genetic diversity were selected to perform PCR amplification and polyacrylamide gel electrophoresis testing on the above sites. Sites with failed amplification, blurry bands, or non-specific bands were excluded. Polyacrylamide gel electrophoresis was further used to confirm the length differences between Indels from the two parental materials, retaining sites with clear differences and single peak patterns.

4.3.5. Final Marker Set

Through the above stepwise screening, 255 Indel molecular markers were ultimately obtained, which are evenly distributed on the genome, stably amplifiable, and highly polymorphic, suitable for subsequent genetic analysis and marker development. Among them, 63 markers showed significant population polymorphism in these 52 tomato varieties. The primers for the selected markers were synthesized by Shanghai Sangong Biotechnology Co., Ltd. (Shanghai, China), and were labeled using the naming format ‘Chromosome number + primer position.’

4.4. PCR Amplification and Polyacrylamide Gel Electrophoresis

PCR amplification and detection of the screened InDel markers were performed using the 52 tomato materials. When the third true leaf of each accession was fully expanded, healthy seedlings were selected, and small amounts of young leaf tissue were collected in 2 mL centrifuge tubes. Genomic DNA was extracted using the CTAB method. The PCR reaction system was 20 μL in volume, containing approximately 10–100 ng of DNA template, 10 μL of 2× PCR mixing buffer, 1 μL each of forward and reverse primers (10 μM), and ddH2O up to 20 μL. The PCR program was as follows: initial denaturation at 95 °C for 3 min; 28 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, and extension at 72 °C for 20 s; and a final extension at 75 °C for 5 min. PCR amplification was performed using a Bio-Rad thermal cycler.
After the PCR reaction was completed, the PCR products were subjected to gel electrophoresis using an 8% polyacrylamide gel. The polyacrylamide gel consisted of 30% polyacrylamide solution, 10% ammonium persulfate (APS) solution, tetramethylethylenediamine solution (TEMED), 10× TBE solution, and ddH2O. After electrophoresis, the gel was first fixed with a fixative solution prepared with anhydrous ethanol and glacial acetic acid, then stained with 0.1% AgNO3 solution, and finally developed with an aqueous solution of sodium hydroxide, sodium tetraborate, and formaldehyde. The electrophoretic bands were counted after the gel was irradiated.

4.5. Data Analysis

The marker polymorphism information content (PIC) was calculated using POWERMARKER V3.25 software [31]. Based on the genotyping results, genetic distance was calculated using MEGA 7.0 software, and cluster analysis was performed using the UPGMA algorithm in MEGA software [32]. DNA fingerprints were then constructed. The population genetic structure of the tomato germplasm was analyzed using STRUCTURE 2.3.4 software [33]. The parameters were set as follows: K ranged from 1 to 10, with 10 independent runs for each K value. The results were imported into Structure Selector (V2.3.4) software [34]. The optimal number of groups K was determined based on lnP(K) and ΔK values. Principal component analysis of the tomato population was performed using Plink 1.9 software [35].
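To give a sense of what a pairwise genetic distance over marker genotypes looks like, the sketch below computes a simple mismatch proportion between coded genotype strings. The distance model actually used in MEGA is not stated in the text, so the simple-matching measure, the treatment of code “0” as missing data, and the function name are all assumptions.

```python
def mismatch_distance(codes_a, codes_b, missing="0"):
    """Proportion of markers with differing codes between two accessions.

    codes_a, codes_b: equal-length strings with one code per marker.
    Markers coded as missing in either accession are skipped.
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("genotype strings must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(codes_a, codes_b):
        if a == missing or b == missing:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no markers scored in both accessions")
    return differing / compared


# Hypothetical four-marker genotypes, for illustration only.
d_half = mismatch_distance("1122", "1221")
d_zero = mismatch_distance("1102", "1112")
```

A matrix of such pairwise values is the kind of input a clustering method such as UPGMA operates on.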

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants15071118/s1, Figure S1: PCR amplification profiles of 52 tomato accessions using ten InDel primer pairs; Table S1: Primer information for total 255 developed Indel markers in Tomato. Table S2: Primer information for 63 core Indel markers in Tomato. Table S3: The main morphological characteristics and agronomic traits of 52 tomato varieties.

Author Contributions

Conceptualization, C.J., Y.Z. (Yunxia Zhao) and C.H.; methodology, C.J., Y.Z. (Yunxia Zhao) and C.H.; software, Z.G.; validation, C.H., Y.Z. (Yaxuan Zhang), D.G., Q.Z. and Y.W.; formal analysis, C.H. and D.G.; investigation, C.H., Y.Z. (Yunxia Zhao) and C.J.; resources, Q.Z. and Y.Z. (Yunxia Zhao); data curation, C.H., Y.Z. (Yunxia Zhao) and C.J.; writing—original draft preparation, C.H. and D.G.; writing—review and editing, C.J., C.H. and D.G.; visualization, Y.Z. (Yaxuan Zhang) and Z.G.; supervision, C.J.; project administration, Y.Z. (Yunxia Zhao) and C.J.; funding acquisition, C.J. and Y.Z. (Yunxia Zhao). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Natural Science Foundation of Fujian (2023J01072 to C.J.), the High-quality Agricultural Development and Ecological Protection Technology Innovation Demonstration Project (NKYGZL-2026-10 to Y.Z.), and the Ningxia Youth Top-Notch Talent Training Project (2022 to Y.Z.).

Data Availability Statement

All data are presented within the article.

Acknowledgments

We are grateful to Ray Ming, Fujian Agriculture and Forestry University, for providing our research group with an excellent platform and generous assistance, and Qing Chang, Nanjing Agricultural University, for editing the images.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
InDel: Insertion-Deletion
PIC: Polymorphism Information Content
RFLP: Restriction Fragment Length Polymorphism
RAPD: Random Amplified Polymorphic DNA
SSR: Simple Sequence Repeat
ISSR: Inter-Simple Sequence Repeat
SRAP: Sequence-Related Amplified Polymorphism
AFLP: Amplified Fragment Length Polymorphism
CAPS: Cleaved Amplified Polymorphic Sequence
SNP: Single Nucleotide Polymorphism

References

  1. Bai, Y.; Lindhout, P. Domestication and breeding of tomatoes: What have we gained and what can we gain in the future? Ann. Bot. 2007, 100, 1085–1094. [Google Scholar] [CrossRef]
  2. Lin, T.; Zhu, G.; Zhang, J.; Xu, X.; Yu, Q.; Zheng, Z.; Zhang, Z.; Lun, Y.; Li, S.; Wang, X.; et al. Genomic analyses provide insights into the history of tomato breeding. Nat. Genet. 2014, 46, 1220–1226. [Google Scholar] [CrossRef]
  3. Klee, H.J.; Tieman, D.M. The genetics of fruit flavour preferences. Nat. Rev. Genet. 2018, 19, 347–356. [Google Scholar] [CrossRef]
  4. Du, M.; Sun, C.; Deng, L.; Zhou, M.; Li, J.; Du, Y.; Ye, Z.; Huang, S.; Li, T.; Yu, J.; et al. Molecular breeding of tomato: Advances and challenges. J. Integr. Plant Biol. 2025, 67, 669–721. [Google Scholar] [CrossRef] [PubMed]
  5. Arya, R.; Saha, S.; Bhadana, D.; Shah, P.; Verma, P.; Das, R.; Kumar, P.; Agarwal, S.; Rao, D.P.; Chaudhary, P. Advancements in Marker-Assisted Breeding for Crop Plants: A Comprehensive Review and Future Directions. Dizhen Dizhi 2023, 15, 1–10. [Google Scholar]
  6. Duan, Y.; He, Y.; Shu, Q.; Ma, W.; Zhang, M.; Liao, Q.; Shi, Y.; Bo, K.; Zhong, Y.; Wang, C. Construction of InDel marker-anchored genetic maps and identification of QTLs governing fruit quality-related traits in winter squash (Cucurbita maxima). Euphytica 2025, 221, 17. [Google Scholar] [CrossRef]
  7. Tan, X.; Zeng, W.; Yang, Y.; Lin, Z.; Li, F.; Liu, J.; Chen, S.; Liu, Y.-G.; Xie, W.; Xie, X. Genome-wide profiling of polymorphic short tandem repeats and their influence on gene expression and trait variation in diverse rice populations. J. Genet. Genom. 2025, 52, 733–746. [Google Scholar] [CrossRef]
  8. Ramesh, P.; Mallikarjuna, G.; Sameena, S.; Kumar, A.; Gurulakshmi, K.; Reddy, B.V.; Reddy, P.C.O.; Sekhar, A.C. Advancements in molecular marker technologies and their applications in diversity studies. J. Biosci. 2020, 45, 15. [Google Scholar] [CrossRef]
  9. Nadeem, M.A.; Nawaz, M.A.; Shahid, M.Q.; Doğan, Y.; Comertpay, G.; Yıldız, M.; Hatipoğlu, R.; Ahmad, F.; Alsaleh, A.; Labhane, N.; et al. DNA molecular markers in plant breeding: Current status and recent advancements in genomic selection and genome editing. Biotechnol. Biotec. Eq. 2018, 32, 261–285. [Google Scholar] [CrossRef]
  10. Shiran, B.; Azimkhani, R.; Mohammadi, S.; Ahmadi, M.R. Potential Use of Random Amplified Polymorphic DNA Marker in Assessment of Genetic Diversity and Identification of Rapeseed (Brassica napus L.) Cultivars. Biotechnology 2006, 5, 153–159. [Google Scholar] [CrossRef]
  11. Sheeja, T.E.; Kumar, I.P.V.; Giridhari, A.; Minoo, D.; Rajesh, M.K.; Babu, K.N. Amplified Fragment Length Polymorphism: Applications and Recent Developments. Methods Mol. Biol. 2021, 2222, 187–218. [Google Scholar] [CrossRef]
  12. Amiteye, S. Basic concepts and methodologies of DNA marker systems in plant molecular breeding. Heliyon 2021, 7, e8093. [Google Scholar] [CrossRef]
  13. Guo, Z.; Yang, Q.; Huang, F.; Zheng, H.; Sang, Z.; Xu, Y.; Zhang, C.; Wu, K.; Tao, J.; Prasanna, B.M.; et al. Development of high-resolution multiple-SNP arrays for genetic analyses and molecular breeding through genotyping by target sequencing and liquid chip. Plant Commun. 2021, 2, 100230. [Google Scholar] [CrossRef]
  14. La Malfa, S.; Bennici, S. Genetics and Molecular Breeding of Fruit Tree Species. Horticulturae 2025, 11, 756. [Google Scholar] [CrossRef]
  15. Moreno-Contreras, V.I.; Delgado-Gardea, M.C.E.; Ramos-Hernández, J.A.; Mendez-Tenorio, A.; Varela-Rodríguez, H.; Sánchez-Ramírez, B.; Muñoz-Ramírez, Z.Y.; Infante-Ramírez, R. Genome-Wide Identification and Characterization of SNPs and InDels of Capsicum annuum var. glabriusculum from Mexico Based on Whole Genome Sequencing. Plants 2024, 13, 3248. [Google Scholar] [CrossRef] [PubMed]
  16. Chen, Y.; Tian, J.; Zhao, Y.; Zhang, J.; Liang, C. A telomere-to-telomere reference genome assembly of tomato cultivar Heinz 1706. Plant Commun. 2025, 101618. [Google Scholar] [CrossRef] [PubMed]
  17. Shirasawa, K.; Ariizumi, T. Near-complete genome assembly of tomato (Solanum lycopersicum) cultivar Micro-Tom. Plant Biotechnol. 2024, 41, 367–374. [Google Scholar] [CrossRef] [PubMed]
  18. Bolger, A.; Scossa, F.; Bolger, M.E.; Lanz, C.; Maumus, F.; Tohge, T.; Quesneville, H.; Alseekh, S.; Sørensen, I.; Lichtenstein, G.; et al. The genome of the stress-tolerant wild tomato species Solanum pennellii. Nat. Genet. 2014, 46, 1034–1038. [Google Scholar] [CrossRef]
  19. Foolad, M.R. Genome Mapping and Molecular Breeding of Tomato. Int. J. Plant Genom. 2007, 2007, 064358. [Google Scholar] [CrossRef] [PubMed]
  20. Chaudhary, J.; Alisha, A.; Bhatt, V.; Chandanshive, S.; Kumar, N.; Mir, Z.; Kumar, A.; Yadav, S.K.; Shivaraj, S.M.; Sonah, H.; et al. Mutation Breeding in Tomato: Advances, Applicability and Challenges. Plants 2019, 8, 128. [Google Scholar] [CrossRef]
  21. Qi, S.; Meng, L.Z.; Lou, Q.; Li, Y.; Shen, Y.; Zhang, S.; Wang, X.; Zhao, P.; Wang, J.; Wang, B.; et al. Association of the tomato co-chaperone gene Sldnaj harboring a promoter deletion with susceptibility to Tomato spotted wilt virus (TSWV). Hortic. Res. 2025, 12, 13. [Google Scholar] [CrossRef]
  22. Torgeman, S.; Pleban, T.; Goldberg, Y.; Ferrante, P.; Aprea, G.; Giuliano, G.; Yichie, Y.; Fisher, J.; Zemach, I.; Koch, A.; et al. Solanum pennellii (LA5240) backcross inbred lines (BILs) for high resolution mapping in tomato. Plant J. 2024, 119, 595–603. [Google Scholar] [CrossRef]
  23. Yu, X.; Qu, M.; Shi, Y.; Hao, C.; Guo, S.; Fei, Z.; Gao, L. Chromosome-scale genome assemblies of wild tomato relatives Solanum habrochaites and Solanum galapagense reveal structural variants associated with stress tolerance and terpene biosynthesis. Hortic. Res. 2022, 9, uhac139. [Google Scholar] [CrossRef] [PubMed]
  24. Li, N.; He, Q.; Wang, J.; Wang, B.; Zhao, J.; Huang, S.; Yang, T.; Tang, Y.; Yang, S.; Aisimutuola, P.; et al. Super-pangenome analyses highlight genomic diversity and structural variation across wild and cultivated tomato species. Nat. Genet. 2023, 55, 852–860. [Google Scholar] [CrossRef]
  25. Sato, S.; Tabata, S.; Hirakawa, H.; Asamizu, E.; Shirasawa, K.; Isobe, S.; Kaneko, T.; Nakamura, Y.; Shibata, D.; Aoki, K. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 2012, 485, 635–641. [Google Scholar] [CrossRef]
  26. Oliveira, M.; Azevedo, L. Molecular Markers: An Overview of Data Published for Fungi over the Last Ten Years. J. Fungi 2022, 8, 803. [Google Scholar] [CrossRef]
  27. Adedze, Y.M.N.; Lu, X.; Xia, Y.; Sun, Q.; Nchongboh, C.G.; Alam, A.; Liu, M.; Yang, X.; Zhang, W.; Deng, Z.; et al. Agarose-resolvable InDel markers based on whole genome re-sequencing in cucumber. Sci. Rep. 2021, 11, 3872. [Google Scholar] [CrossRef]
  28. Pan, G.; Li, Z.; Huang, S.; Tao, J.; Shi, Y.; Chen, A.; Li, J.; Tang, H.; Chang, L.; Deng, Y.; et al. Genome-wide development of insertion-deletion (InDel) markers for Cannabis and its uses in genetic structure analysis of Chinese germplasm and sex-linked marker identification. BMC Genom. 2021, 22, 595. [Google Scholar] [CrossRef] [PubMed]
  29. Pons, C.; Casals, J.; Brower, M.; Sacco, A.; Riccini, A.; Hendrickx, P.; del Rosario Figás, M.; Fisher, J.; Grandillo, S.; Mazzucato, A.; et al. Diversity and genetic architecture of agro-morphological traits in a core collection of European traditional tomato. J. Exp. Bot. 2023, 74, 5896–5916. [Google Scholar] [CrossRef]
  30. Farinon, B.; Picarella, M.E.; Siligato, F.; Rea, R.; Taviani, P.; Mazzucato, A. Phenotypic and Genotypic Diversity of the Tomato Germplasm from the Lazio Region in Central Italy, with a Focus on Landrace Distinctiveness. Front. Plant Sci. 2022, 13, 16. [Google Scholar] [CrossRef] [PubMed]
  31. Botstein, D.; White, R.L.; Skolnick, M.; Davis, R.W. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 1980, 32, 314–331. [Google Scholar]
  32. Kumar, S.; Stecher, G.; Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol. Biol. Evol. 2016, 33, 1870–1874. [Google Scholar] [CrossRef] [PubMed]
  33. Raj, A.; Stephens, M.; Pritchard, J.K. fastSTRUCTURE: Variational inference of population structure in large SNP data sets. Genetics 2014, 197, 573–589. [Google Scholar] [CrossRef] [PubMed]
  34. Li, Y.; Liu, J. StructureSelector: A web-based software to select and visualize the optimal number of clusters using multiple methods. Mol. Ecol. Resour. 2018, 18, 176–177. [Google Scholar] [CrossRef] [PubMed]
  35. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575. [Google Scholar] [CrossRef]
Figure 1. Genome-wide distribution of InDel loci and selected molecular markers in the tomato genome. The outermost layer represents chromosomes (red). From the second ring inwards, the layers are arranged in the order: genes (blue), InDels (≥6 bp, green), and selected InDel molecular markers (pink). Each 50 kb interval was used as a statistical unit.
Figure 2. PCR amplification profiles of 52 tomato accessions using four InDel primer pairs. HZ1706, TOM and AC are the reference genome varieties, while T1–T52 indicate the 52 tomato accessions.
Figure 3. Chromosomal distribution of the 63 core InDel markers in the tomato genome.
Figure 4. Cluster analysis of 52 tomato germplasm accessions based on InDel molecular markers.
Figure 5. Population structure and principal component analysis of 52 tomato germplasm accessions. (a) The cross-validation error (CV_error) curve for determining the optimal number of genetic clusters; (b) Population structure bar plot of the 52 tomato accessions; (c) Principal Component Analysis (PCA) plot.
Table 1. InDel loci in the whole tomato genome.
Chromosome    Insertion    Deletion    Total      InDel (%)
Chr01         7498         13,648      21,146     7.40
Chr02         8052         13,766      21,818     7.63
Chr03         12,137       14,721      26,858     9.40
Chr04         11,579       15,166      26,745     9.36
Chr05         11,378       14,644      26,022     9.11
Chr06         7865         12,458      20,323     7.11
Chr07         11,313       14,316      25,629     8.97
Chr08         10,758       14,980      25,738     9.01
Chr09         12,544       14,534      27,078     9.47
Chr10         10,925       13,165      24,090     8.43
Chr11         8542         11,011      19,553     6.84
Chr12         9342         11,454      20,796     7.28
Total         121,933      163,863     285,796    100.00
Table 2. Allele number and polymorphism information content (PIC) of 63 core InDel markers across 52 tomato accessions.
Chromosome    Marker     Allele Number    PIC      Chromosome    Marker     Allele Number    PIC
Ch01          T1M0546    3                0.390    Ch06          T6M3582    2                0.309
Ch01          T1M3899    2                0.355    Ch06          T6M3953    2                0.232
Ch01          T1M4575    2                0.298    Ch06          T6M4626    2                0.383
Ch01          T1M5533    3                0.402    Ch07          T7M0019    2                0.038
Ch01          T1M6009    2                0.307    Ch07          T7M0756    2                0.201
Ch01          T1M6565    2                0.258    Ch07          T7M1005    2                0.111
Ch02          T2M0112    2                0.134    Ch07          T7M5397    2                0.185
Ch02          T2M1114    2                0.074    Ch08          T8M2758    2                0.143
Ch02          T2M3180    2                0.124    Ch08          T8M3706    2                0.186
Ch02          T2M3592    2                0.373    Ch08          T8M5194    2                0.141
Ch02          T2M4420    2                0.145    Ch09          T9M0714    2                0.299
Ch02          T2M5113    2                0.284    Ch09          T9M0982    2                0.138
Ch03          T3M0497    2                0.378    Ch09          T9M1163    2                0.368
Ch03          T3M5319    2                0.359    Ch09          T9M2159    2                0.299
Ch03          T3M5408    3                0.376    Ch09          T9M3154    2                0.299
Ch04          T4M0210    2                0.368    Ch09          T9M3515    2                0.304
Ch04          T4M2158    2                0.384    Ch09          T9M3926    2                0.299
Ch04          T4M4727    2                0.384    Ch09          T9M4118    2                0.299
Ch04          T4M6000    2                0.384    Ch09          T9M4314    2                0.299
Ch05          T5M0251    2                0.299    Ch09          T9M4940    2                0.299
Ch05          T5M0701    2                0.285    Ch09          T9M5137    2                0.309
Ch05          T5M1237    2                0.357    Ch10          TaM0641    2                0.375
Ch05          T5M1244    2                0.358    Ch10          TaM6127    2                0.382
Ch05          T5M1709    2                0.291    Ch10          TaM6451    2                0.286
Ch05          T5M2207    2                0.356    Ch11          TbM0488    2                0.128
Ch05          T5M3207    2                0.090    Ch11          TbM0724    2                0.368
Ch05          T5M3712    2                0.280    Ch11          TbM1162    2                0.357
Ch05          T5M4208    2                0.344    Ch11          TbM2541    2                0.299
Ch05          T5M4704    2                0.356    Ch11          TbM3021    2                0.376
Ch06          T6M1523    2                0.295    Ch12          TcM0679    2                0.306
Ch06          T6M3024    2                0.194    Ch12          TcM6300    2                0.172
Ch06          T6M3003    2                0.193
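As an illustration of how a PIC value relates to allele frequencies, the following is a minimal Python sketch that assumes the formula of Botstein et al. [31], PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The function name `pic` and the example frequencies are hypothetical; they are not taken from the genotype data or analysis scripts of this study, and the exact procedure used to obtain the values in Table 2 is not restated here.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein et al. (1980) formula:
        PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    allele_freqs: allele frequencies at the locus, which must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies
    sum_sq = sum(p * p for p in allele_freqs)
    # Pairwise correction term over all unordered allele pairs
    pair_term = sum(2 * (p * p) * (q * q)
                    for p, q in combinations(allele_freqs, 2))
    return 1.0 - sum_sq - pair_term


if __name__ == "__main__":
    # Hypothetical frequencies for illustration only
    for freqs in ([0.5, 0.5], [0.9, 0.1], [0.6, 0.3, 0.1]):
        print(freqs, round(pic(freqs), 3))
```

Under this formula, a marker with two alleles reaches its maximum PIC of 0.375 when both alleles are equally frequent, and a monomorphic marker has a PIC of 0.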
Table 3. DNA fingerprints of 52 tomato germplasm accessions based on 24 core InDel markers.
Accession    DNA Fingerprint             Accession    DNA Fingerprint
T1           111111111111311111111111    T27          212231221111221111111211
T2           111131111111231111111211    T28          212231221112221111211211
T3           111131011111231111211011    T29          422231222202221111210121
T4           212212122212221111112121    T30          012231211112231111112211
T5           212232122212221111220331    T31          210231221111221111111211
T6           210232122212221111122122    T32          222221221112231121211211
T7           222231122212221111212121    T33          422221211102231110222121
T8           412232022212221111102122    T34          422221211112231111212311
T9           410232122212221111102122    T35          212231211112221120112211
T10          212231121112221111210211    T36          422231111111221121202121
T11          212232122222221120111121    T37          212232122212221111210001
T12          212231121112231210121122    T38          212231122222221121022121
T13          212232121112221211212212    T39          212231121112221111211212
T14          212232121111221111210211    T40          222221121122221111211122
T15          212232122202221111112122    T41          010231221112221111211011
T16          212231121122201122022121    T42          422201211112231111212121
T17          212221121112222121112122    T44          212231221112221111121121
T18          220032122212221111111121    T43          212232222212221211211120
T19          212232121122231211111212    T45          012222222202222112211212
T20          210232122212221121121121    T46          212221222212221111212121
T21          210231220111221111111121    T47          210222221322222111221121
T22          212231221111221111111211    T48          212022022212221121201121
T23          322221211111222112112211    T49          210032022212221121211331
T24          212221221111221111111121    T50          212020022212221111111122
T25          212231221111221111111121    T51          212222020102221121121212
T26          210222211112230110212211    T52          212230021112221111211122
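Fingerprint strings of this kind can be compared position by position. The following minimal Python sketch assumes that each of the 24 digits encodes the genotype at one core marker and treats the digits as opaque symbols, because the meaning of the individual codes is not defined in this table. The helper names `count_differences` and `shared_fingerprints` are hypothetical, and only three rows (T2, T3 and T4) are copied from Table 3 as example input.

```python
from collections import defaultdict
from itertools import combinations

# Three example rows copied from Table 3 (24 digits, one per core marker)
fingerprints = {
    "T2": "111131111111231111111211",
    "T3": "111131011111231111211011",
    "T4": "212212122212221111112121",
}


def count_differences(a, b):
    """Number of marker positions at which two fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def shared_fingerprints(profiles):
    """Groups of accessions that carry an identical fingerprint string."""
    groups = defaultdict(list)
    for accession, code in profiles.items():
        groups[code].append(accession)
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    # Pairwise comparison of the example accessions
    for (name1, fp1), (name2, fp2) in combinations(fingerprints.items(), 2):
        print(name1, name2, count_differences(fp1, fp2))
    print("identical profiles:", shared_fingerprints(fingerprints))
```

Applied to all 52 rows, the same two helpers would give a simple pairwise difference matrix and a check of whether any accessions share a profile.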
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Huang, C.; Ge, D.; Zhang, Y.; Ge, Z.; Wu, Y.; Zhang, Q.; Zhao, Y.; Ji, C. Genome-Wide InDel Marker Development and Genetic Diversity Analysis of 52 Tomato Germplasm Accessions. Plants 2026, 15, 1118. https://doi.org/10.3390/plants15071118


Chicago/Turabian Style

Huang, Chenjiao, Di Ge, Yaxuan Zhang, Zhiye Ge, Yicheng Wu, Qianrong Zhang, Yunxia Zhao, and Chonghui Ji. 2026. "Genome-Wide InDel Marker Development and Genetic Diversity Analysis of 52 Tomato Germplasm Accessions" Plants 15, no. 7: 1118. https://doi.org/10.3390/plants15071118

