InDel Markers Based on 3K Whole-Genome Re-Sequencing Data Characterise the Subspecies of Rice (Oryza sativa L.)

Abstract: A molecular marker is a valuable tool in genetic research. Insertions–deletions (InDels) are commonly used polymorphisms in gene mapping, analysis of genetic diversity, marker-assisted breeding, and phylogenetics. The 3000 Rice Genome Project, a re-sequencing project, discovered millions of genome-wide InDels. We found that InDels longer than 50 bp (699,475) accounted for 56.02% of the total (1,248,503). The number of InDels on each chromosome was consistent with the corresponding chromosome length. The largest number of >50-bp InDels was on chromosome 1 (78,935) and the smallest on chromosome 9 (41,752), with an average density of 1.87 InDels/kb (range: 1.50–2.36 InDels/kb). Furthermore, 96 InDels, spaced at approximately 3.98 Mb/InDel, were selected to test for polymorphism. These markers performed well in 2% agarose gel electrophoresis. Phylogenetic analysis showed that the InDel markers were highly polymorphic among japonica and indica rice varieties, and that varieties could be classified based on the statistical results of their polymorphisms. The InDel markers could also be applied to identify the recombinant inbred lines in a population. These results reveal that high-density long InDel markers can support studies of functional diversity, species variation, and map-based cloning.


Introduction
Rice (Oryza sativa) feeds more than half of the global population. Globally, approximately two billion people chronically suffer from hunger or food shortage [1]. With the increasing population and environmental destruction, food security problems are becoming severe. Therefore, increasing yield is an important goal in cereal crop research and breeding. As effective tools, various molecular markers, including restriction fragment length polymorphisms (RFLP), random amplified polymorphic DNA (RAPD), microsatellites or simple sequence repeats (SSR), single nucleotide polymorphisms (SNPs), and insertions-deletions (InDels), have been successfully developed and applied widely since the 1980s [2][3][4][5][6][7].
As the first generation of molecular marker technology, RFLP played an important role in early molecular breeding and genetic research. However, it is based on Southern blotting, with the drawbacks of a tedious operating process, long detection period, and high cost in large-scale molecular breeding [8]. RAPD is a molecular marker technology that uses a random primer (8-10 bases) to amplify DNA fragments by polymerase chain reaction (PCR). However, it has limitations in gene mapping and linkage mapping because it applies only to dominant inheritance and cannot recognise heterozygous loci [9]. SSR markers mainly comprise 2-4 repeat sequences, which are widely distributed in different locations of the genome. Owing to their high stability and repeatability, SSR markers are widely used in rice research [2,10]. SNP refers to the insertion, deletion, or substitution of a base between different alleles at the same locus [11]. SNPs have higher polymorphism and frequency in the genome than SSR markers and are widely applied in genotyping arrays, molecular marker-assisted breeding, and identification of germplasm resources [12].
With the continuous completion of plant genome DNA sequencing and the rapid development of comparative genomics research, extensive biological information has become available for plant research and genetic breeding [13]. InDels are widely present in plant genomes and are mainly caused by transposable element movement, replication slippage, and unequal exchange within the genome [14,15]. By using the basic local alignment search tool (BLAST) to compare the total genomic DNA sequences of different individuals of the same species or of related species, many InDels, ranging from 1 bp to 1000 bp, were identified. More than 99% of the InDels were <50 bp in length, with an average length of 36 bp [15][16][17]. According to the analysis of InDel site sequences, InDels are grouped into five classes: (1) InDels of single-base pairs, (2) monomeric base-pair expansions, (3) multibase-pair expansions of 2-15 bp repeat units, (4) transposon insertions, and (5) InDels containing random DNA sequences [17]. InDels are widely distributed in the genome. The distribution and density of InDels are second only to those of SNPs and much higher than those of SSRs [18]. In the human genome, the average InDel density within genes is one InDel per 6.3 kb, which is similar to the average InDel density of the whole genome (one InDel per 7.2 kb) [17]. The average density in the rice genome is approximately one InDel per 953 bp [13]. InDel markers have been widely used for genetic studies in wheat, Arabidopsis, citrus, and rice [19][20][21][22]. As a membrane protein, ALMT1 facilitates an aluminium-stimulated malate efflux. Researchers have developed repetitive InDel markers (ALMT1-SSR3a and ALMT1-SSR3b) to screen 20 diverse wheat genotypes and identified six allele variants at the ALMT1SSR3 locus [19].
InDels have been used to successfully distinguish the seven common Arabidopsis accessions, and more than 35 mutations in the Col-03×Ws-4 combination were characterised [20]. In rice, InDel markers have been applied in the classification of nine blast resistance genes at the Piz, Piz-t, Pit, Pik, Pik-m, Pik-p, Pita, Pita-2, and Pib loci [22]. Based on the genome sequence of Nipponbare (Oryza sativa, Japonica) and 93-11 (Oryza sativa, Indica), 479,406 InDels were identified, and the correlation between InDels and SSR markers was analysed, which indicated that InDels would be very useful for rice gene cloning [13].
With advances in next-generation sequencing (NGS) technologies, studies on rice have significantly improved our knowledge of genome-wide genetic variation [23]. The Chinese Academy of Agricultural Sciences, BGI-Shenzhen, and the International Rice Research Institute conducted the 3000 Rice Genome Project (3K-RG), which is a re-sequencing project. Many SNPs and InDels that would help in rice genomics research and breeding have been identified [24][25][26]. For the rice blast resistance gene Pid3, 71 polymorphic loci and 40 haplotypes were identified using the 3K-RG sequencing data. A new allele was found, which suggested a new approach for rice breeding: the use of new disease resistance genes [27]. Soil salinity is a major factor affecting the growth and yield of rice. To identify new salt tolerance genes, researchers subjected 664 cultivated rice varieties from the 3K-RG sequencing project to salt tolerance treatment combined with whole-genome association analysis and haplotype analysis, which laid a foundation for molecular breeding and functional analysis [28]. Kaur et al. (2020) identified the relationship between a 12-bp InDel and plant salt tolerance by using the SNP-Seek database and found that the 12-bp InDel can be used for differentiating between indica and japonica varieties [29,30]. In the past several decades, the origin of rice domestication has been controversial. Research on the genetic diversity of rice has centred on two hypotheses of domestication: (1) indica and japonica rice arose from two independent domestication events and (2) indica rice was developed from crosses between japonica rice and wild rice [31][32][33]. Based on the 3000 rice varieties, 32 retrotransposon families were successfully identified using the newly developed TRACKPOSON software, and the origin of rice was inferred from the insertion of retrotransposons. The authors concluded that rice may have originated from three independent events [34].
The 3K-RG re-sequencing project has identified 29 million SNPs and 2.4 million small InDels [24]. The data are mainly focused on genome variation, allele variation, and haplotype analysis [25,26,28]. However, the distribution and polymorphism of many InDels in indica and japonica rice and the polymorphism of these markers in different rice varieties are unclear. In this study, we identified 699,475 InDels of >50 bp in length and randomly selected 96 InDels from 12 chromosomes for verification. The results revealed a high polymorphism in different tested varieties of indica and japonica rice, and a set of near-isogenic lines could be well distinguished by InDel markers. These markers displayed better ability in distinguishing various rice resources with different backgrounds compared with other markers and contributed to revealing the genetic basis of biodiversity.

Plant Materials and Cultivation
A panel of 98 rice cultivars, including 36 japonica-type rice cultivars, 36 indica-type rice cultivars, and 26 recombinant inbred line progenies and their corresponding parents, was used in this study (Tables S1 and S2). Among these, the 36 japonica-type rice cultivars were used to analyse InDel-based identification of varieties, the 36 indica-type rice cultivars were used to analyse the divergence of varieties, and the 26 recombinant inbred line progenies were used to identify the genetic background. Some of these rice accessions were provided by IRRI and were planted in Wuhan under proper management.

Data Sources of 3024 Rice Genomic Sequences
The genomic sequence data of 3024 rice cultivars and the InDel information (deletions and insertions, each called against the Nipponbare MSU7/IRGSP1.0 genome) were downloaded from the website of the Rice SNP-Seek Database (http://iric.irri.org, accessed on 22 August 2020). Using custom Python scripts, we extracted data of >50-bp-long InDel fragments for further analyses. The Nipponbare genome (IRGSP-1.0) was also downloaded from Ensembl Plants. Information regarding 72 cultivars of the 3000 Rice Genome Project was downloaded from the Rice Functional and Genomic Breeding website (RFGB, http://www.rmbreeding.cn, accessed on 25 August 2020). The cultivar information is listed in Supplementary Table S1.
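The length filter described above can be illustrated with a minimal sketch. The scripts used in the study are not given here, so the tab-separated layout (columns `chrom`, `pos`, `ref`, `alt`), the function names, and the example records are all assumptions made for illustration; only the >50 bp threshold comes from the text.

```python
import csv
import io

MIN_LEN = 50  # threshold from the text: keep InDels longer than 50 bp


def indel_length(ref, alt):
    """Size in bp of the inserted or deleted segment."""
    return abs(len(alt) - len(ref))


def filter_long_indels(handle, min_len=MIN_LEN):
    """Yield (chrom, pos, kind, length) for InDels longer than min_len.

    The column names are hypothetical; a real download may differ.
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        length = indel_length(row["ref"], row["alt"])
        if length > min_len:
            kind = "insertion" if len(row["alt"]) > len(row["ref"]) else "deletion"
            yield row["chrom"], int(row["pos"]), kind, length


# Made-up records standing in for a downloaded InDel table.
lines = [
    "chrom\tpos\tref\talt",
    "chr01\t1000\tA\tA" + "T" * 60,        # 60-bp insertion, kept
    "chr01\t2000\tACGT\tA",                # 3-bp deletion, dropped
    "chr09\t3000\t" + "G" * 81 + "\tG",    # 80-bp deletion, kept
]
example = io.StringIO("\n".join(lines) + "\n")
long_indels = list(filter_long_indels(example))
```

Computing the length as the absolute difference between the reference and alternative alleles treats insertions and deletions symmetrically, which matches the way both are pooled under a single >50 bp criterion in the text.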

PCR Primer Design
To design primers for PCR validation, we extracted 250-900-bp long sequences, including a 50-500-bp InDel region and flanking sequences of 100-200 bp on both sides of the InDel region. Primers were designed using Primer 5 software, with a primer length of 18-22 nt and an optimal length of 20 nt. The GC content was 45-60%, the melting temperature (Tm) was 55-65 °C, and the optimum Tm was 58 °C. The difference in annealing temperature between paired primers was no more than 5 °C. The expected PCR amplification product length was 250-1000 bp. Primer dimers, hairpin structures, and mismatches were avoided in the design process. All operations were performed using custom Perl scripts. The primers are listed in Supplementary Table S3.
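The numerical design constraints above (length 18-22 nt, GC content 45-60%, Tm 55-65 °C, paired difference of no more than 5 °C) can be expressed as a simple screening sketch. This is not the Primer 5 or Perl workflow used in the study: the Tm estimate below uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is a rough approximation chosen here only for illustration, the paired-primer difference is treated as a Tm difference by assumption, and the primer sequences are invented.

```python
def gc_percent(primer):
    """GC content of a primer as a percentage."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough Tm in deg C by the Wallace rule (an assumed, approximate formula)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def primer_ok(primer):
    """Check one primer against the length, GC, and Tm ranges in the text."""
    return (
        18 <= len(primer) <= 22
        and 45 <= gc_percent(primer) <= 60
        and 55 <= wallace_tm(primer) <= 65
    )


def pair_ok(forward, reverse, max_tm_diff=5):
    """Check both primers and the Tm difference between them."""
    return (
        primer_ok(forward)
        and primer_ok(reverse)
        and abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_diff
    )


# Invented sequences, not primers from Table S3.
forward = "ACGTACGTACGTACGTACGT"  # 20 nt, 50% GC
reverse = "GCATGCATGCATGCATGC"    # 18 nt, about 55.6% GC
pair_passes = pair_ok(forward, reverse)
```

Dimer, hairpin, and mismatch checks are omitted because they require sequence alignment logic beyond the scope of this sketch.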

DNA Extraction and PCR Amplification
Genomic DNA was extracted from young leaves of all rice cultivars according to the CTAB method [35]. PCR was performed in a reaction volume of 25 µL, which included approximately 50 ng of template DNA, 12.5 µL of 2× PCR mixture buffer, 2 µL of 10 nM primer, and ddH2O. The DNA amplification scheme included initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 55-65 °C for 30 s, and extension at 72 °C for 60 s, and a final extension at 72 °C for 2 min. The reaction was performed in a LifeECO thermal cycler. The PCR products were separated using 2.0% agarose gel electrophoresis.
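As a small worked example, the total programmed hold time of the amplification scheme above can be added up from the stated steps. The variable names are illustrative, and ramping time between temperatures, which depends on the instrument, is not included.

```python
# Step durations in seconds, taken from the amplification scheme in the text.
INITIAL_DENATURATION_S = 5 * 60
CYCLES = 35
CYCLE_STEPS_S = {"denaturation": 30, "annealing": 30, "extension": 60}
FINAL_EXTENSION_S = 2 * 60


def program_hold_time_s():
    """Sum of all programmed holds; temperature ramps are ignored."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total_s = program_hold_time_s()
total_min = total_s / 60
```

The cycling stage dominates the run: 35 cycles of 120 s account for 4200 s of the 4620 s total.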

Phylogenetic Analysis
Each band produced by the InDel primers was treated as a unit character and scored as a binary code (1/0). The phylogenetic trees were analysed using DPS 7.5 based on the 1/0 matrix [36], and the genetic relationships were assayed using MEGA7 [37], based on unweighted pair group method with arithmetic mean (UPGMA) cluster analysis.
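To make the 1/0 scoring and UPGMA clustering concrete, the following is a minimal pure-Python sketch. It is not the DPS 7.5 or MEGA7 procedure used in the study: the simple-matching distance, the nested-tuple tree output, and the four made-up band profiles are assumptions chosen for illustration.

```python
from itertools import combinations


def simple_matching_distance(a, b):
    """Fraction of scored bands at which two 1/0 profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(profiles):
    """Cluster named 1/0 profiles by UPGMA and return a nested-tuple tree."""
    # Map each cluster (a name or a nested tuple) to the leaves it contains.
    clusters = {name: [name] for name in profiles}
    leaf_dist = {
        frozenset((p, q)): simple_matching_distance(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }

    def avg_dist(c1, c2):
        # UPGMA: unweighted mean over all leaf pairs between two clusters.
        pairs = [(p, q) for p in clusters[c1] for q in clusters[c2]]
        return sum(leaf_dist[frozenset(pq)] for pq in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merged_leaves = clusters.pop(c1) + clusters.pop(c2)
        clusters[(c1, c2)] = merged_leaves
    return next(iter(clusters))


# Hypothetical band profiles (1 = band present, 0 = band absent).
profiles = {
    "V1": [1, 0, 1, 1, 0, 1],
    "V2": [1, 0, 1, 1, 0, 0],
    "V3": [0, 1, 0, 0, 1, 1],
    "V4": [0, 1, 0, 0, 1, 0],
}
tree = upgma(profiles)
```

In this toy data set V1 and V2 differ at one band, as do V3 and V4, so the two pairs are joined first and the two resulting clusters are merged last.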

Identification and Distribution of InDels in the 3000 Rice Genome Dataset
Based on the next-generation sequencing data from 3024 rice varieties, the InDel information was downloaded from the Rice SNP-Seek Database website. A total of 1,248,503 InDels were distributed across all 12 chromosomes (Figure 1A). The number of InDels on each chromosome was generally consistent with the corresponding chromosomal length; the largest number (137,443) was on chromosome 1 and the smallest (77,000) was on chromosome 9 (Figure 1B, Table 1). The average density was 3.34 InDels/kb, with a range of 2.78-4.07 InDels/kb across the 12 chromosomes. The lowest and highest densities were on chromosomes 3 and 11, respectively (Figure 1C, Table 1). We used a Python script to retrieve the sequences of the 699,475 InDels longer than 50 bp, whose distribution was consistent with that of the full set. The maximum number (78,935) of >50-bp long InDels was located on chromosome 1, whereas the minimum number (41,752) was located on chromosome 9; the densities differed from those of the full set, but the highest density was again on chromosome 11 (2.36 InDels/kb) and the lowest on chromosome 3 (1.50 InDels/kb) (Figure 1B,C, Table 1). These results suggest that InDels are widely and evenly distributed on each chromosome, which implies that InDels can be developed as high-density molecular markers.
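The summary figures above can be cross-checked with simple arithmetic. The sketch below recomputes the share of >50-bp InDels and back-calculates the sequence span implied by each reported mean density; the reference length itself is not stated in this section, so the implied spans are derived quantities rather than reported values, and the variable names are illustrative.

```python
# Counts and mean densities as reported in the text.
TOTAL_INDELS = 1_248_503
LONG_INDELS = 699_475            # InDels longer than 50 bp
MEAN_DENSITY_ALL = 3.34          # InDels per kb, all InDels
MEAN_DENSITY_LONG = 1.87         # InDels per kb, >50-bp InDels

# Share of long InDels among all InDels, in percent.
long_fraction_pct = 100 * LONG_INDELS / TOTAL_INDELS

# Span in kb implied by density = count / length.
implied_kb_all = TOTAL_INDELS / MEAN_DENSITY_ALL
implied_kb_long = LONG_INDELS / MEAN_DENSITY_LONG
```

Both densities imply a span of roughly 374,000 kb, so the two reported means are consistent with the same underlying reference length, and the computed share agrees with the reported 56.02% to within rounding.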

Application of New InDel Markers in Nipponbare
To facilitate detection by agarose gel electrophoresis, we selected 50-900-bp long InDels and designed the primers 100-200 bp upstream and downstream of each InDel (Figure 2A). Following an even distribution along the chromosomes, we randomly selected InDels and designed 96 primer pairs at a spacing of approximately 3.98 Mb/InDel (Figure 2B, Table S3).
We used Nipponbare (Oryza sativa Japonica) as a control to test the amplification efficiency of the 96 primer pairs. All primer pairs amplified products between 250 bp and 1000 bp, which were easily detected by 2% agarose gel electrophoresis (Figure 2C). These results indicate that InDels exist widely in the genome and have great application potential in plant research.
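One way to obtain randomly chosen yet evenly spaced markers, as described in this section, is to split each chromosome into windows of the target spacing and draw one candidate InDel per window. The exact selection procedure used in the study is not described, so the binning strategy, the function name, the fixed random seed, and the candidate positions below are all assumptions.

```python
import random

SPACING_BP = 3_980_000  # target spacing from the text: about 3.98 Mb per InDel


def pick_one_per_bin(positions, spacing=SPACING_BP, seed=0):
    """Randomly pick one candidate position from each occupied window."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    bins = {}
    for pos in positions:
        bins.setdefault(pos // spacing, []).append(pos)
    return [rng.choice(bins[b]) for b in sorted(bins)]


# Made-up candidate InDel positions (bp) on a single chromosome.
candidates = [120_000, 950_000, 4_100_000, 6_500_000,
              7_900_000, 8_200_000, 15_000_000]
selected = pick_one_per_bin(candidates)
```

Windows that contain no candidate are simply skipped, so the realised spacing can exceed the target where candidates are sparse.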

InDel Marker Could Be Applied for Accurate Identification of Varieties in Japonica Rice
To assess the value of InDel markers for genetic analysis of varieties in japonica rice, we screened 36 japonica rice varieties, which originated from 15 countries, for polymorphism analysis using InDel markers (Table S1). The PCR products ranged from 500 bp to 1000 bp in 2% agarose gel (Figures 3A and S1). The results also indicated that the selected 36 japonica rice varieties had polymorphic profiles (Figures 3A and S1).

Using the UPGMA approach, a dendrogram was constructed based on the product length of 26 InDel markers in 36 different rice varieties (Figure 3B). The materials could be divided into three monophyletic groups when the similarity value was only 0.0474, and the groups were designated as GJ-I, GJ-II, and GJ-III (Figure 3B). GJ-II comprised two subgroups: M1 and M2. M1 comprised IRIS_313-10111, IRIS_313-10097, IRIS_313-10373, and IRIS_313-10798 (Figure 3B). IRIS_313-10373 and IRIS_313-10798 originated from Southeast Asia and were geographically related. They were well clustered into the same sub-type by these markers (Figure 3B, Table S1). Similarly, M2 contained ZH11, Balila, and IRIS_313-10059. IRIS_313-10059 and ZH11 were from East Asia, and their genetic relationship was also extremely close (Figure 3B, Table S1). GJ-III mainly included two subtypes: N1 and N2. According to the polymorphism among varieties, the N2 sub-type could be divided into two types, namely N21 and N22, and N22 could be divided into three different groups, namely N221, N222, and N223. IRIS_313-10083 and IRIS_313-10272 in N21 were from Europe and were geographically close (Figure 3B, Table S1).
N222 could be divided into many small branches according to the polymorphism of InDels; its members, IRIS_313-11463, IRIS_313-11438, IRIS_313-9969, IRIS_313-11622, IRIS_313-11433, IRIS_313-11458, IRIS_313-11510, IRIS_313-11570, IRIS_313-8073, IRIS_313-11735, IRIS_313-11508, and IRIS_313-11844, could be grouped into three small branches. Among them, IRIS_313-11433, IRIS_313-11458, IRIS_313-11570, and IRIS_313-11735 were all from the same area but fell in different small branches (Figure 3B, Table S1). Based on these results, our InDel markers could distinguish different japonica rice varieties. InDels showed rich polymorphisms in different japonica rice varieties, which could help reveal the origin of distinct varieties to a certain extent and facilitate differentiation between them.

InDel Markers Might Contribute to the Divergence of Varieties in Indica
We tested the polymorphism of these InDel markers to assess whether they can be used to distinguish indica rice varieties. We selected 36 indica rice varieties from 14 countries and used 25 markers for further analysis (Table S1). The PCR product length ranged from 500 bp to 1000 bp, as detected using 2% agarose gel (Figure 4A, Figure S2 and Table S3). The results revealed that the 36 selected indica rice varieties exhibited rich diversity (Figure 4A, Figure S2 and Table S3).

InDel Markers Could Be Applied to Identify the Recombinant Inbred Lines in a Population
We found that InDel markers could effectively identify the polymorphisms of indica and japonica rice. To study whether these markers can distinguish the lines of a recombinant inbred line (RIL) population, we selected an RIL population derived from the cross between 9311 and YouA and used 32 of its progenies. The PCR-amplified polymorphisms indicated abundant diversity among the progenies (Figure 5A, Figure S3 and Table S3).
The UPGMA method was used to construct a phylogenetic tree for further analysis. The results showed that the selected 32 RIL progenies could be grouped into two types according to their parents 9311 and YouA: RIL-1 and RIL-2. RIL-1 included X12 and X22 as well as parent YouA (Figure 5B). The other progenies, except X-10, and parent 9311 were classified as RIL-2. Among them, 1880, which has a background similar to that of 9311, was also classified as RIL-2 (Figure 5B). These results indicated that the RILs could be distinguished by the InDel markers and that the InDel markers exhibited great potential in identifying the genetic background of rice varieties.
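A simpler alternative to the tree-based grouping above is to ask, for each progeny, which parent's band profile it matches at more markers. The sketch below is not the UPGMA analysis reported here; the matching rule, the tie handling, and the made-up 1/0 profiles (with placeholder names rather than the actual lines) are assumptions for illustration.

```python
def match_fraction(profile, parent):
    """Fraction of markers at which a progeny profile equals a parent profile."""
    return sum(x == y for x, y in zip(profile, parent)) / len(parent)


def closer_parent(profile, parents):
    """Return the name of the best-matching parent, or None on a tie."""
    scores = {name: match_fraction(profile, p) for name, p in parents.items()}
    best = max(scores.values())
    winners = [name for name, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None


# Hypothetical 1/0 band profiles; these are not data from Figure 5.
parents = {
    "parent_A": [1, 0, 1, 0, 1, 0, 1, 0],
    "parent_B": [0, 1, 0, 1, 0, 1, 0, 1],
}
progeny = {
    "line_1": [1, 0, 1, 0, 1, 0, 0, 1],  # matches parent_A at 6 of 8 markers
    "line_2": [0, 1, 0, 1, 0, 1, 1, 0],  # matches parent_B at 6 of 8 markers
    "line_3": [1, 0, 1, 0, 0, 1, 0, 1],  # matches each parent at 4 of 8
}
assignments = {name: closer_parent(p, parents) for name, p in progeny.items()}
```

A line that matches neither parent well, or both equally, is left unassigned, which is the kind of case that would merit closer inspection.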

Discussion
As an effective tool for studying genetics, molecular markers have been well improved and widely applied in the analysis of genetic diversity, determination of variety identity, marker-assisted breeding, and phylogenetics [38]. Compared with traditional markers, such as AFLP, RFLP, and SSR, InDels have the advantages of wide distribution, high density, large number, and easy detection [21]. Owing to the rapid development of next-generation sequencing technologies, many InDels have been identified [20]. In rice, the implementation of the 3K-RG re-sequencing project can help in identifying InDels and analysing species variation [24,26]. Among the 3024 sequenced rice varieties, we identified 1,248,503 InDels and screened out 699,475 InDels longer than 50 bp, which were widely distributed on the 12 chromosomes with an average density of 1.87 InDels/kb (Figure 1B, Table 1). The number of InDels on each chromosome was consistent with the corresponding chromosome length; the largest number was on chromosome 1 and the smallest was on chromosome 9. The frequency of >50-bp long InDels ranged from 1.50 to 2.36 InDels/kb across the 12 chromosomes (Figure 1B, Table 1). The higher frequency and amplification efficiency of InDels in rice help in utilising InDels for genetic diversity analysis and marker screening [39]. We designed 96 InDel markers evenly distributed across all 12 chromosomes with a physical distance of approximately 3.98 Mb/InDel. The results suggest that the primers amplified well in Nipponbare and that the InDels existed widely in the genome (Figure 2A,B, Table S3).
Owing to long-term cultivation and evolution under diverse agroecological conditions, Asian cultivated rice has differentiated into indica and japonica varieties and has, consequently, developed reproductive isolation [40,41]. To determine the features of indica and japonica rice, Cheng's index, which contains six selected morphological and physiological characteristics, is most commonly used [42]. However, this method has many limitations associated with the elimination of environmental factors and the identification of RILs with complex genetic backgrounds. The genetic background differences between some indica and japonica varieties are small; therefore, identification is difficult. With the development of molecular marker technology, these limitations have been largely resolved. Analysis of a set of rice germplasm (including 166 Asian rice varieties, 2 African rice varieties, 30 accessions of wild rice species, and 42 weedy rice accessions) showed that indica and japonica rice could be differentiated by using the SSIF, which comprised 40 InDel markers [43]. In a previous study, by analysing the entire genomic sequences of indica (9311) and japonica rice (Nipponbare), 45 typical InDel markers were obtained and 34 markers were found to be closely related to the differentiation of indica and japonica rice [44]. The 3K-RG re-sequencing project provides massive data from rich sources, which makes it possible to solve this dilemma by identifying many InDels [24,26]. Here, we selected 36 different japonica rice varieties from 15 countries for further analysis (Table S1). The results revealed that these varieties were well divided into three monophyletic groups, GJ-I, GJ-II, and GJ-III, by 26 InDel markers (Figure 3). Among these groups, GJ-II and GJ-III could be further divided into different subtypes; in particular, GJ-III could be successively divided into N1 and N2, N21 and N22, and N221 and N222 according to the polymorphism of InDel markers (Figure 3).
These results showed that the InDel markers were highly polymorphic among japonica rice varieties, and japonica rice could be preliminarily classified based on the statistical results of these polymorphisms. IRIS_313-10083 and IRIS_313-10272, IRIS_313-11433 and IRIS_313-11458, and IRIS_313-11570 and IRIS_313-11735 were from the same area but were grouped into different branches (Figure 3). Therefore, high-density InDel markers are of great significance in identifying the diversity and consistency of japonica rice varieties.
We selected eight varieties from unknown countries for the polymorphism analysis and found that these varieties could be grouped in different branches (Figure 3). Therefore, by studying their relatedness with nearby varieties, the possible location of origin of these varieties can be inferred, which provides a powerful tool for the protection and utilisation of variety diversity. Moreover, these polymorphisms also existed in indica rice varieties. Analysis of 36 indica rice varieties from 14 countries using 25 specific InDel markers showed that indica rice could also be divided into three groups, XI-I, XI-II, and XI-III. The XI-II and XI-III subtypes could be divided into many small branches (Figure 4, Table S2). This shows that the above markers were also widely polymorphic in the differentiation of indica rice. InDel markers have provided great opportunities to examine functional diversity, and with the development of more InDel markers in major crop species, significant genetic diversity studies can be conducted.
The InDel molecular markers can be used to determine the characteristics of indica and japonica, and this is convenient for breeders to avoid similarities in selecting varieties [44]. We selected a series of RILs derived from the cross between 9311 and YouA for polymorphism analysis. The results indicated that the RIL progenies can be divided into two types according to the parents 9311 and YouA, and this result can be used to infer which parent a progeny more closely resembles (Figure 5). Interestingly, as X-10 fell in the outer group, we speculate that the previous generation of parents may not have been strictly self-pollinated but may have been pollinated by other varieties. This also confirmed that our InDel markers can help identify the homozygosity of offspring and provide a reliable way to construct novel materials (Figure 5B). When using the markers to analyse the polymorphism of the selected varieties, we found heterozygous bands in multiple varieties (Figures 3A, 4A and 5A). These results mean that the varieties were not homozygous and still required selfing and selection by molecular markers. Through the development of the above InDel markers, the homozygosity of a variety can be identified, which will be beneficial in experiments.
Traditionally, the length of most of the identified InDels is 2-50 bp, and the average length is 36 bp [17,39]. PCR products of such short InDels can only be resolved through polyacrylamide gel electrophoresis, a method that is time-consuming and labour-intensive. In addition, acrylamide and bis-acrylamide are harmful to the human body. Here, we identified 699,475 InDels of >50 bp in length, accounting for 56.02% of all InDels (Table 1). Long InDel fragments are more convenient for detection. All our InDel products could be easily checked using 2% agarose gel electrophoresis, which greatly reduced the detection cost. DNA polymorphism is the foundation of molecular marker development, which is widely used in gene mapping [13]. InDel-based PCR markers are among the most commonly used methods [20]. In Arabidopsis thaliana, in the whole-genome sequence analysis of 1135 accessions from a worldwide hierarchical collection, many InDels were identified based on the genomic sequence variation, which promoted the wide application of this marker technology in map-based cloning [20,45]. In rice, the high-density long InDel fragments identified through the 3K-RG re-sequencing project promoted the map-based cloning of essential functional genes [24,26,34]. While validating InDel polymorphisms in the database, we further designed 182 pairs of long-fragment InDel markers, covering 15-18 markers on each chromosome (Table S4). These genome-wide markers will contribute to rice research, including map-based cloning and germplasm identification.

Conclusions
Owing to the advantages of wide distribution, high density, large number, and easy detection, InDel markers have attracted considerable attention. In this study, we identified 1,248,503 InDels and screened out 699,475 InDels longer than 50 bp based on 3K whole-genome re-sequencing data. We found that the number of InDels on each chromosome was consistent with the corresponding chromosome length, and 96 InDels spaced at approximately 3.98 Mb/InDel performed well in 2% agarose gel electrophoresis. Phylogenetic analysis revealed that the InDel markers were highly polymorphic among japonica and indica rice varieties, and these varieties could be classified based on the statistical results of their polymorphisms. Furthermore, the InDel markers could be applied to identify the RILs in a population. We have further designed 108 pairs of long-fragment InDel markers with the view that these genome-wide markers will contribute to further studies on functional diversity, species variation, and map-based cloning in rice varieties.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agriculture11070655/s1, Figure S1: PCR amplification patterns of the InDel markers that could discriminate varieties in japonica rice. Figure S2: PCR amplification of 36 indica rice varieties using the 26 InDel markers. Figure S3: PCR amplification of the recombinant inbred lines (RILs) population. Table S1. The indica and japonica rice varieties used in this study. Table S2. The recombinant inbred line progenies used in this study. Table S3. Primer sequences for 96 InDel markers used in this study. Table S4. Primer sequences for an additional 108 new InDel markers.
Data Availability Statement: Data generated and analyzed during this study are contained in this manuscript.