Identification of Insertion/Deletion Markers for Photoperiod Sensitivity in Rice (Oryza sativa L.)

Simple Summary
Photoperiod sensitivity is important for rice breeders to enable crop adaptation to changing climatic conditions. This study aims to pinpoint candidate insertion/deletion (INDEL) markers linked to photoperiod sensitivity in indica rice cultivars. The INDEL marker can be utilized to explore the gene function in rice flowering, potentially facilitating gene editing using CRISPR/Cas9 technology. Applying this INDEL marker in breeding programs can enhance the genetic gain and economic efficiency of a targeted breeding strategy.

Abstract
The current study aims to identify candidate insertion/deletion (INDEL) markers associated with photoperiod sensitivity (PS) in rice landraces from the Vietnamese Mekong Delta. The whole-genome sequencing of 20 accessions was conducted to analyze INDEL variations between two photoperiod-sensitivity groups. A total of 2240 INDELs were identified between the two photoperiod-sensitivity groups. The selection criteria included INDELs with insertions or deletions of at least 20 base pairs within the improved rice group. Six INDELs were discovered on chromosomes 01 (5 INDELs) and 06 (1 INDEL), and two genes were identified: LOC_Os01g23780 and LOC_Os01g36500. The gene LOC_Os01g23780, which may be involved in rice flowering, was identified in a 20 bp deletion on chromosome 01 from the improved rice accession group. A marker was devised for this gene, indicating a polymorphism rate of 20%. Remarkably, 20% of the materials comprised improved rice accessions. This INDEL marker could explain 100% of the observed distinctions. Further analysis of the mapping population demonstrated that an INDEL marker associated with the MADS-box gene on chromosome 01 was linked to photoperiod sensitivity. The F1 population displayed two bands across all hybrid individuals. The marker demonstrates efficacy in distinguishing improved rice accessions within the indica accessions. This study underscores the potential applicability of the INDEL marker in breeding strategies.


Introduction
Rice (Oryza sativa) production is important in the global food supply, particularly in Asian countries. Rice is grown in diverse climatic regions with wide variation in their photoperiods [1,2]. It is inherently a short-day plant, displaying pronounced photoperiod sensitivity, where short days encourage flowering while long days inhibit it [3]. The heading date (flowering time) is an important trait for the regional and seasonal adaptation of rice and influences grain yield [4]. Thus, the control of photoperiod sensitivity in rice is essential for producing rice adapted to different climatic conditions [5].
The application of single-nucleotide polymorphism (SNP) genotyping methods based on next-generation sequencing and high-throughput genotyping has gained popularity in various crops [25][26][27][28][29]. Genome-wide association studies (GWAS), facilitated by SNP analysis, enable the study of complex quantitative traits such as plant stress tolerance, growth, and development.
Insertion/deletion (INDEL) markers are valuable markers in rice breeding [30], aiding in distinguishing the phenotypic traits of rice varieties [31][32][33][34][35]. However, existing studies have primarily utilized available INDEL markers from the Nipponbare genomic reference database. The distinction between photoperiod-sensitive and non-sensitive rice based on whole-genome sequencing and INDEL markers has not been a primary focus of previous studies. In this study, we employed the whole-genome sequencing of 20 indica rice accessions (including 16 landraces and four improved rice accessions) collected from the Vietnamese Mekong Delta provinces to analyze the relationship between two distinct groups based on INDEL information.

Plant Materials
A total of twenty rice accessions from a pool of 99 rice accessions were selected following the criteria outlined by Tam et al. [36]. The selected accessions comprised sixteen landraces and four improved rice accessions originating from the Vietnamese Mekong Delta provinces (Table 1). The origin locations of the studied rice landraces are characterised by salinity-affected and rainfed agro-ecology. In contrast, the origins of the improved rice accessions are considered freshwater and irrigated agro-ecology. Genomic DNA was extracted using the DNeasy Plant Mini kit (Qiagen, Hilden, Germany), adhering to the manufacturer's protocol. Sample preparation of leaf specimens and DNA quality assessment followed the procedures described previously [36]. Sterilized seeds of each accession were placed in Petri dishes for germination. Three days after germination, the Petri dishes were transferred into a growth chamber at 27 °C. Leaves were harvested for DNA extraction 7 to 10 days after germination. The quantity of DNA was measured using a Thermo Scientific NanoDrop 2000 spectrophotometer (Fisher Scientific, Hampton, NH, USA) with a volume of 1 µL. The DNA quality of the samples was checked by 1% (w/v) agarose gel electrophoresis.

Whole-Genome Sequencing
The whole genome of the selected varieties was sequenced using Illumina's next-generation sequencing approach. Library preparation and sequencing were performed at each step of the procedure. The workflow was as follows: (1) checking of DNA; (2) construction of libraries; (3) checking of libraries; (4) sequencing.
We performed two main methods of quality control for DNA samples: (1) agarose gel electrophoresis to test DNA degradation and potential contamination, and (2) Qubit 2.0 to quantify DNA concentration precisely.
Library construction and quality control: A total amount of 1.0 µg DNA per sample was used as input material for DNA sample preparation. Sequencing libraries were generated using the NEBNext® DNA Library Prep Kit (Novogene corporation INC (823 Anchorage Place, Chula Vista, CA, USA)) following the manufacturer's recommendations, and indices were added to each sample. Genomic DNA was randomly fragmented to a size of 350 bp by shearing; then, DNA fragments were end-polished, A-tailed, and ligated with an NEBNext adapter for Illumina sequencing, and further PCR-enriched by P5 and indexed P7 oligos. PCR products were purified (AMPure XP system), and the resulting libraries were analyzed for size distribution by an Agilent 2100 Bioanalyzer (Santa Clara, CA, USA) and quantified using real-time PCR.
Sequencing: Qualified libraries were fed into Illumina sequencers after pooling according to their effective concentration and expected data volume.

Data Processing
Processing of INDELs for the twenty varieties followed the protocol outlined by Tam et al. [36]. Sequenced raw reads underwent filtering and sorting based on original sample names. Sequence trimming was executed to a length of 350 bp using Trimmomatic with the following specified parameters: LEADING:19, TRAILING:19, SLIDINGWINDOW:30:20, AVGQUAL:20, and MINLEN:51. High-quality reads were mapped to the Nipponbare IRGSP1.0 reference genome using Bowtie2 [37], available in Galaxy (https://usegalaxy.org, accessed on 24 November 2023). Further filtering was performed using Picard in Galaxy, and alignments were adjusted around INDELs using the IndelRealigner tool in Genome Analysis Toolkit (GATK) v2.8 [38]. INDEL calling utilized the UnifiedGenotyper tool in GATK v2.8 [39], and the initial INDEL dataset was filtered based on the following parameters: missing call rate (MCR), 100%; minor allele frequency (MAF), 0.05; and heterozygosity rate (HR), 0.0. Subsequent dataset filtering included functions to separate accessions into two groups (landrace and improved), leading to the identification of 2240 INDELs for further analysis.
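The site-level filtering described above can be illustrated with a minimal Python sketch. It is not the TASSEL or GATK implementation used in the study. It assumes that the stated thresholds mean a site is kept only when every accession has a call, the minor allele frequency is at least 0.05, and no accession is heterozygous; the function name and the genotype encoding are hypothetical.

```python
from collections import Counter

def passes_site_filters(genotypes, min_call_rate=1.0, min_maf=0.05, max_het_rate=0.0):
    """Return True if one INDEL site passes the three filters.

    genotypes: one entry per accession, either an (allele, allele) tuple
    such as (0, 0) or (1, 1), or None for a missing call.
    The thresholds are an interpretation of the values quoted in the text.
    """
    called = [g for g in genotypes if g is not None]
    # Call rate: fraction of accessions with a genotype call.
    if not called or len(called) / len(genotypes) < min_call_rate:
        return False
    # Heterozygosity rate: fraction of called accessions with two different alleles.
    het_rate = sum(1 for a, b in called if a != b) / len(called)
    if het_rate > max_het_rate:
        return False
    # Minor allele frequency: frequency of the second most common allele.
    counts = Counter(allele for g in called for allele in g)
    if len(counts) < 2:
        return False  # monomorphic site, nothing to compare between groups
    maf = sorted(counts.values(), reverse=True)[1] / sum(counts.values())
    return maf >= min_maf

# Hypothetical site: 16 accessions homozygous for one allele, 4 for the other.
site = [(0, 0)] * 16 + [(1, 1)] * 4
print(passes_site_filters(site))
```

A site with this 16:4 split has a minor allele frequency of 0.2, so it is retained under the assumed thresholds, whereas a single missing or heterozygous call would exclude it.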

Identifying INDELs between Photoperiod-Sensitive and Non-Sensitive Rice Groups
The INDELs, derived from dataset filtering (2240 INDELs) between the two groups, with a minimum length of 20 bp, were specifically chosen to emphasize the distinct variations between landrace and improved rice. In a previous study, Adedze et al. selected INDEL sizes starting from 30 bp as agarose gel PCR-based markers [40]. The selected INDELs (candidates) were employed for gene filtering using the Rice SNP-Seek Database (https://snp-seek.irri.org/, accessed on 7 December 2023). Candidate INDEL positions were utilized to search the Rice SNP-Seek Database using the Search/Genotype function.
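As a rough illustration of the size criterion, the following Python sketch selects INDELs of at least 20 bp from VCF-style reference and alternate alleles. The record layout, function names, and example alleles are hypothetical and are not taken from the study's dataset.

```python
def indel_length(ref, alt):
    """Size of the insertion or deletion implied by VCF-style REF and ALT alleles."""
    return abs(len(alt) - len(ref))

def indel_type(ref, alt):
    """Classify an allele pair relative to the reference allele."""
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "none"

def select_long_indels(records, min_len=20):
    """Keep records (chrom, pos, ref, alt) whose INDEL is at least min_len bp."""
    return [r for r in records if indel_length(r[2], r[3]) >= min_len]

# Hypothetical records: the allele strings are placeholders, not real sequence.
records = [
    ("01", 1000, "A" + "C" * 20, "A"),  # 20 bp deletion, kept
    ("01", 2000, "A", "A" + "G" * 5),   # 5 bp insertion, dropped
    ("06", 3000, "T", "T" + "A" * 25),  # 25 bp insertion, kept
]
for chrom, pos, ref, alt in select_long_indels(records):
    print(chrom, pos, indel_type(ref, alt), indel_length(ref, alt))
```

A 20 bp threshold keeps the allele size difference large enough to resolve on a gel, which is the same motivation as the 30 bp cut-off cited from Adedze et al.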

Primer Design for Testing Improved Rice Varieties
A primer pair was designed on the INDEL located within the gene identified in the Rice SNP-Seek Database to confirm the differences between the landrace and improved groups. The primer pair, named Photoperiod-sensitivity (PS), was designed using Primer3Plus (http://primer3plus.ut.ee/cgi-bin/primer3plus/primer3plus.cgi, accessed on 7 December 2023). The left primer was 5′-GAGGGAGCTCTCCATCCTCT-3′ and the right primer was 5′-GCTTCAACTCGAGGCACTCT-3′, targeting the region on chromosome 01. The expected amplification product was 203 bp based on the reference sequence of the Nipponbare variety.

Validation of INDEL Candidate from the Rice SNP-Seek Database
The validation of potential INDEL candidates based on the Rice SNP-Seek Database (3K) is a pivotal step. This database is well known for SNP, INDEL, gene, and other genomic data from 3024 whole-genome sequencing accessions [41]. Mansueto et al. categorized these accessions into twelve distinct sub-populations based on SNP and INDEL profiles, including admix, aro, aus, ind1a, ind1b, ind2, ind3, indx, japx, subtrop, temp, and trop [41]. In order to ascertain the credibility of the INDEL candidates in our study, an investigation was conducted within the rice genome utilizing the 3K database. This involved examining the genomic positions of the INDEL candidates across the 3K dataset. The diversity of INDEL markers was assessed, focusing on the occurrence of Nipponbare reference genome sequences or deletions within each sub-population.

Data Analysis
The filtering process utilized TASSEL 5.2.50 (Trait Analysis by Association, Evolution and Linkage) [42]. Candidate genes were scrutinized using the QTL and gene database in the Rice SNP-Seek Database [41]. The variations among the twelve rice groups in the Rice SNP-Seek Database were examined, and the ratio of deletion accessions to total accessions was calculated for each group.
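The per-group ratio of deletion accessions to total accessions can be sketched in Python as follows. The input format (one sub-population label and one deletion flag per accession) and the function name are assumptions made for illustration, and the example calls are invented placeholders rather than values from the 3K database.

```python
from collections import defaultdict

def deletion_ratio_by_group(calls):
    """Ratio of accessions carrying the deletion to all accessions, per group.

    calls: iterable of (group_label, has_deletion) pairs, one per accession.
    """
    totals = defaultdict(int)
    deletions = defaultdict(int)
    for group, has_deletion in calls:
        totals[group] += 1
        if has_deletion:
            deletions[group] += 1
    # Every group in totals has at least one accession, so no division by zero.
    return {group: deletions[group] / totals[group] for group in totals}

# Hypothetical calls for illustration only.
example_calls = [
    ("ind1a", True), ("ind1a", False),
    ("temp", False), ("temp", False),
]
print(deletion_ratio_by_group(example_calls))
```

Reporting a ratio rather than a raw count makes sub-populations of very different sizes comparable.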
This study focused on insertions or deletions between the two groups, with a minimum length of 20 bp. This analysis revealed 22 INDEL markers, of which 14 were on Chr.01, 1 was on Chr.03, 2 were on Chr.04, 2 were on Chr.06, and 3 were on Chr.07 (Table 3). Additionally, only one INDEL marker (S01_14359423) involved an insertion; the rest were deletions. This implies that 21 INDEL markers differed from the Nipponbare genome (Japonica rice cultivar). Almost all genome types of the improved group differed from the genome reference, except for three INDEL markers (S01_14359423, S06_24284080, and S07_1972486) (Table 3).

Gene Related to Photoperiod Sensitivity
To identify candidate genes that are associated with photoperiod sensitivity, the locations of the 22 INDELs were examined on the website https://snp-seek.irri.org/ (accessed on 10 June 2023). Ten INDELs located in 10 genes were found in landrace and improved rice accessions. Each gene had one INDEL. Seven genes were located on chromosome 01, two on chromosome 04, and one on chromosome 07 (Table 4). The functions of these genes included MADS-box family genes, acyl-ACP thioesterase, DUF26 kinases, carboxyl-terminal peptidase, pentatricopeptide, CBS domain-containing protein, homeobox-associated leucine zipper, WD domain, G-beta repeat domain-containing protein, and CAMK_KIN1/SNF1/Nim1_like_AMPKh.1-CAMK, the latter of which includes calcium/calmodulin-dependent protein kinases. One gene (LOC_Os01g23780), encoding a putative regulator of rice flowering [43], was identified from 13373620 to 13374665 on chromosome 01. At this gene's position, the improved rice accession group exhibited a 20 bp deletion compared to the landrace accession group.

Differences in Photoperiod-Sensitivity Group in Gene LOC_Os01g23780
The MADS-box family gene LOC_Os01g23780, which may be involved in rice flowering [43], was located in the INDEL region that differed between the landrace (photoperiod-sensitive) and improved (photoperiod-insensitive) varieties. We therefore designed a primer pair using Primer3Plus (http://primer3plus.ut.ee/cgi-bin/primer3plus/primer3plus.cgi, accessed on 7 December 2023) and obtained the photoperiod-sensitivity primer (PS primer) with the following sequences: left primer 5′-GAGGGAGCTCTCCATCCTCT-3′ and right primer 5′-GCTTCAACTCGAGGCACTCT-3′. The target region was located at positions 13373620 to 13373910 on chromosome 01, and the primer pair produced two prominent diagnostic fragments of approximately 203 bp and 183 bp for landrace rice and improved rice, respectively. At this position, the improved rice accession group carried a 20 bp deletion relative to the sequence of the Nipponbare reference and the indica landrace rice.
To confirm the phenotypic diversity of the two distinct photoperiod-sensitivity groups, we designed a primer to screen the gene LOC_Os01g23780. The results indicate that the PCR product of the landrace accession group was about 20 bp larger than that of the improved group (Figure 1). Each of the 16 landrace accessions exhibited a PCR product measuring 203 bp. In contrast, the four improved rice accessions displayed a band of 183 bp (Figure 1). In addition, Figure 2 illustrates that the F1 population resulting from the cross between Nang Thom Cho Dao and MTL372 exhibited two bands across all hybrid individuals. Specifically, the Nang Thom Cho Dao accession displayed a band of 203 bp, while MTL372 exhibited a band of 183 bp.
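The band patterns reported above map onto genotype classes in a simple way, which the following Python sketch makes explicit. Only the 203 bp and 183 bp fragment sizes come from the text; the function name, the class labels, and the 5 bp sizing tolerance are hypothetical choices for illustration.

```python
LANDRACE_BP = 203  # fragment size reported for landrace accessions
IMPROVED_BP = 183  # fragment size reported for improved accessions

def classify_bands(band_sizes, tolerance=5):
    """Classify one gel lane from its estimated band sizes (in bp).

    tolerance is an assumed allowance for gel sizing error; it must stay
    below half of the 20 bp difference so the two alleles cannot be confused.
    """
    has_landrace = any(abs(b - LANDRACE_BP) <= tolerance for b in band_sizes)
    has_improved = any(abs(b - IMPROVED_BP) <= tolerance for b in band_sizes)
    if has_landrace and has_improved:
        return "heterozygous"  # both bands, as seen in the F1 individuals
    if has_landrace:
        return "landrace"
    if has_improved:
        return "improved"
    return "unscorable"

print(classify_bands([203]))       # landrace parent pattern
print(classify_bands([183]))       # improved parent pattern
print(classify_bands([203, 183]))  # F1 hybrid pattern
```

Because the marker is co-dominant, a lane showing both bands identifies a true hybrid, whereas a single band in a putative F1 would suggest selfing of one parent.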
Phenotypic analyses of heading dates and agronomic traits were carried out on 20 accessions in Hong Dan district, Bac Lieu province (latitude: 9°29′20.6″ N; longitude: 105°30′24.1″ E) from September 2020 to March 2021. Among these accessions, the heading dates of 16 landraces ranged from 24 December 2020 to 31 January 2021 (Supplementary Table S1). In contrast, four improved accessions were not affected by short-day conditions, displaying heading patterns determined by their genetic traits (Supplementary Table S1). The improved-accession group displayed shorter plant heights in comparison to the landrace group (Supplementary Table S1).
The INDEL marker S01_13373751, located in the gene LOC_Os01g23780, was employed to investigate the genetic diversity within the rice genome via the Rice SNP-Seek database (3K). It revealed that five sub-populations (aro, japx, subtrop, temp, and trop) exhibited identical genotypes to those of the Nipponbare and landrace groups (Table 5). Therefore, this INDEL marker distinguishes landrace and improved groups and separates the genotype of the five subgroups above from the improved indica rice group (Table 5).

Discussion
Cultivating improved rice varieties with photoperiod insensitivity and a short growth duration has been prominent in Vietnam and other Asian countries, driven by agricultural intensification for enhanced food security and export [44]. Photoperiod sensitivity is important to rice breeders [45]. In the Vietnamese Mekong Delta, since the 1980s, improved rice varieties have gradually replaced rice landraces with photoperiod sensitivity and long growth durations [46]. The key distinction between landrace and improved rice lies in their responsiveness to day length, impacting crucial aspects such as flowering, grain yield, and overall rice productivity. To investigate photoperiod sensitivity, numerous earlier studies have concentrated on the heading date [3,6,45] because rice typically utilizes photoperiodism to regulate the timing of flowering. The progression of rice development leading up to floral initiation is characterized by two consecutive phases: the basic vegetative growth phase, which is photoperiod-insensitive, and the photoperiod-sensitive phase [47].
INDEL polymorphisms are the second most prevalent type of genetic variation, following SNPs, in humans and plants [48]. In addition, INDEL markers easily distinguish the genotypes among individuals based on their size. Therefore, developing INDEL markers is becoming popular for crop genetic studies [48].
Fourteen INDELs in Chr.01 exhibited insertions or deletions of at least 20 bp, suggesting their potential utility for distinguishing between the two groups. An INDEL at position 13373751 bp on chr.01, involving a 20 bp deletion in the improved rice group compared to the landrace group, was identified. This INDEL resides within the gene LOC_Os01g23780, classified as OsMADS95-MADS-box, a gene known for its involvement in floral organ development, flower development, and overall plant growth and development [43]. LOC_Os01g23780 plays a crucial role during the reproductive phase of rice [43].
The INDEL candidate marker for the gene LOC_Os01g23780 successfully distinguished landrace and improved rice groups. Utilizing this INDEL marker allows breeders to assess the outputs of crosses between landrace and improved rice parents. Successful crossings can be identified by the presence of two bands in the F1 individual using the PS marker. Additionally, applying this marker can enhance the genetic gain and economic efficiency of a targeted breeding strategy [49]. Furthermore, this INDEL marker exhibited broader applicability, separating five subgroups (aro, japx, subtrop, temp, and trop) from the improved indica rice group (Table 5). This finding underscores the potential significance of the identified INDEL marker in both distinguishing and categorizing rice accessions, providing valuable insights for further breeding programs and genetic studies.

Conclusions
This study was undertaken to elucidate the genetic distinctions, specifically the number of INDELs, between landrace and improved rice groups. A significant discovery is that a substantial proportion of the 2240 identified INDELs (1263 INDELs) were concentrated on chromosome 01. In addition, the specific INDEL position S01_13373751 within the gene LOC_Os01g23780 emerged as a focal point of significance.
The outcomes of the current study lend strong support to the proposition that the INDEL marker S01_13373751 holds practical utility. This marker demonstrates efficacy in distinguishing improved rice accessions within the indica accessions. Further, it serves as a discriminative tool between improved indica rice accessions and japonica rice accessions. This study underscores the potential applicability of the INDEL marker in breeding strategies.

Figure 2. Gel electrophoresis image illustrating locus phase amplification patterns in the F1 population. Rice landrace produces a PCR product with a size of 203 bp and improved rice produces a PCR product with a size of 183 bp, whereas the F1 population has bands with sizes of 203 bp and 183 bp for the whole hybrid population. 1: Marker labels; 2: Nang Thom Cho Dao (rice landrace); 3: MTL372 (improved rice); 4-12: F1.

Table 1. Rice accession codes and names, origin provinces, and photoperiod sensitivity.

Table 2. Number of INDELs identified in rice accessions based on 20 whole-genome sequences.

Table 3. Different characteristics of INDELs with minimum lengths of 20 bp between two accession groups.

Table 4. Comprehensive INDEL information for the 20 selected accessions.

Table 5. Number of accessions displaying deletion at position 13373751 on chromosome 01 from the 3K database.
In our study, the identification of 2240 INDELs from the genome sequencing of 20 accessions, representing two distinct groups of landrace and improved rice, revealed significant differences. Chr.01 exhibited the highest discrepancy, with 1263 INDELs, followed by chr.04 (220 INDELs), 06 (171 INDELs), and 03 (123 INDELs), collectively accounting for 79.3% of all INDELs (1777 INDELs). The information on these INDELs can be found in the 3K database, where the highest number of INDELs is in chr.01 [41].
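The chromosome totals quoted above can be checked with a few lines of Python. The counts are those stated in the text; the variable names are illustrative.

```python
# INDEL counts per chromosome as quoted in the text.
indels_per_chromosome = {"01": 1263, "04": 220, "06": 171, "03": 123}
total_indels = 2240

top_four = sum(indels_per_chromosome.values())
top_four_share = round(100 * top_four / total_indels, 1)
chr01_share = round(100 * indels_per_chromosome["01"] / total_indels, 1)

print(top_four)        # 1777
print(top_four_share)  # 79.3
print(chr01_share)     # 56.4
```

The four chromosomes therefore account for 1777 of the 2240 INDELs (79.3%), and chromosome 01 alone contributes a little over half (56.4%).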