Detection of Genomic Regions Controlling the Antioxidant Enzymes, Phenolic Content, and Antioxidant Activities in Rice Grain through Association Mapping

Because it is rich in antioxidant compounds, the staple food of rice provides many health benefits. Five antioxidant traits in rice grain, viz., catalase, CUPRAC, DPPH, FRAP and peroxidase, were mapped in a representative panel population containing 117 germplasm lines using 131 SSR markers through association mapping. Donor lines rich in multiple antioxidant properties were identified from the mapping population. The population was classified into three genetic groups and each group showed reasonable correspondence with the antioxidant traits. The presence of linkage disequilibrium in the population was confirmed from the estimated Fst values. A strong positive correlation of DPPH was established with TPC, FRAP and CUPRAC. A moderate to high mean gene diversity was observed in the panel population. Eleven significant marker-trait associations for antioxidant traits were mapped, namely, qACD2.1, qACD11.1 and qACD12.2 for DPPH; qCAT8.1 and qCAT11.1 for catalase; qFRAP11.1, qFRAP12.1 and qFRAP12.2 for FRAP; and qCUPRAC3.1, qCUPRAC11.1 and qCUPRAC12.1 regulating CUPRAC. Co-localization of the QTLs qACD11.1, qFRAP11.1 and qCUPRAC11.1 was detected; this region may act as an antioxidant hotspot regulating DPPH, FRAP and CUPRAC activities, respectively, while qACD12.2 and qFRAP12.1 remained close on chromosome 12. These detected QTLs will be useful in antioxidant improvement programs in rice.


Introduction
Antioxidants protect the plant cell from damage and act as a defense system for maintaining the structural and functional integrity of the cell [1,2]. Antioxidants also influence seed viability, vigor and longevity by preventing seed deterioration [2-4]. Further, several antioxidants of rice have impressive health benefits [5-7]. Consumption of whole grain rice enriched with antioxidant compounds improves human health by reducing the risk of a number of chronic diseases [8,9]. Rice is used as a staple food by more than two billion people globally, and enriching grain with antioxidant compounds may be the best option to utilize rice as a health-promoting food. Moreover, improvement in antioxidants will lead to the development of superior quality seeds, which will enhance rice production because good seed is a basic and vital input for crop production [10]. In addition, rice seed enriched with antioxidants (catalase and peroxidase) also enhances the resilience of the rice crop to stress situations [11-14]. The antioxidant traits are complex, polygenic in nature and quantitatively inherited [15]. There is a need to develop molecular markers for enhancing these phytochemicals in rice through a molecular breeding approach.
In rice, ascorbate peroxidase (APx) was reported to be encoded by eight APx genes: two cytosolic isoforms encoded by OsAPx1 and OsAPx2, two peroxisome/glyoxysome isoforms encoded by OsAPx3 and OsAPx4, and chloroplastidic isoforms encoded by OsAPx5, OsAPx6, OsAPx7 and OsAPx8 [13,16]. Catalase (CAT) is encoded by a small family of three genes: CatA, CatB and CatC [13,17]. Reports on suitable marker loci for identifying these genes for catalase (CAT) and peroxidase (PEROX) are not available. However, several reports on QTL mapping are available on total phenolics content (TPC) and activity of antioxidants such as DPPH (2,2-diphenyl-1-picrylhydrazyl), FRAP (ferric reducing antioxidant power) and CUPRAC (cupric reducing antioxidant capacity) [8,10,18-20]. For better understanding of these complex traits, more genes/loci need to be identified that will lead to the development of trait-specific markers, which will accelerate the efforts to breed high-yielding antioxidant-rich rice varieties. The present study mainly aimed at QTL mapping of antioxidant traits, such as CAT, PEROX, TPC, DPPH, FRAP and CUPRAC activity, in rice.
Association mapping has emerged as a powerful alternative strategy for identifying genes or QTLs for various complex traits in plants in a naturally variable population by examining the marker-trait associations. Mapping of complex antioxidant compounds and antioxidant activity by exploiting the naturally occurring variations through association mapping will provide QTLs that regulate the phytochemicals in rice. The genetic diversity and structure of the population in association mapping will be helpful for detecting marker-trait associations that may be useful for trait enhancement in molecular breeding programs. In order to avoid spurious marker-phenotype associations, population structure (Q) and relative kinship (K) analyses are used to check and correct the panel population composition for linkage disequilibrium (LD) mapping analyses [21-24]. The association estimates based on both the generalized linear model (GLM) and mixed linear model (MLM) are considered appropriate for mapping complex traits, and have been shown to perform better than other models.
In the present investigation, association mapping was performed in a panel population containing 117 genotypes (64 white and 53 red grain) shortlisted through phenotyping of six phytochemical traits (CAT, PEROX, TPC, DPPH, FRAP and CUPRAC) from 270 shortlisted diverse genotypes of India. The population was studied for the population genetic structure, diversity and association of molecular markers with these phytochemical traits.

Phenotyping of the Population for Antioxidant Traits in Rice
A total of five antioxidant traits of rice grain, namely, catalase, peroxidase, DPPH, FRAP and CUPRAC, along with antioxidant compounds and phenolic content, were estimated from 270 genotypes during the wet season of 2018 (Supplementary Table S1). Wide variation was observed for the six antioxidant traits among the germplasm lines. Each antioxidant trait in the population was grouped into five classes and the germplasm lines were classified into various groups (Figure 1). The frequency distribution of each group or population is presented in Figure 1. A representative mapping population for the six antioxidant traits was developed by shortlisting 117 germplasm lines from all groups and traits from the initial 270 germplasm lines (Table 1; Figure 2). The average values of the six antioxidant traits estimated from the panel population exhibited wide variation among the germplasm lines for the studied traits (

Genotype-by-Trait Biplot Analysis for Relatedness among the Germplasm Lines for the Antioxidant Traits
The genotype-by-trait biplot diagram was generated using the first two principal components of the six antioxidant properties estimated from the mapping population of the panel containing 117 germplasm lines (Figure 3). A total of 53.14 and 19.57% of the total variability, and eigenvalues of 3.189 and 1.17, were estimated for the 1st and 2nd principal components, respectively (Supplementary Figure S1). Peroxidase contributed maximum diversity followed by catalase and FRAP among the six antioxidant traits studied using the panel population (Figure 3). The distribution pattern of germplasm lines in the four quadrants of the biplot indicates that genotypes carrying high concentrations of antioxidants are seen in the 1st (top right) and 2nd (bottom right) quadrants. Germplasm lines containing higher concentrations of multiple antioxidant compounds and enzymes are depicted in the circle located in these two quadrants (
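The reported percentages are consistent with a principal component analysis run on the correlation matrix of the six traits, in which case the eigenvalues sum to six and each component explains eigenvalue/6 of the variability (3.189/6 ≈ 53.15% and 1.17/6 ≈ 19.5%, matching the reported 53.14 and 19.57% up to rounding of the eigenvalues). The following is a minimal sketch under that assumption; the function name and the synthetic demonstration data are illustrative and are not taken from the study.

```python
import numpy as np

def pca_explained(traits):
    """Eigenvalues and % variance explained for a correlation-matrix PCA.

    traits: 2-D array, rows = germplasm lines, columns = traits.
    """
    corr = np.corrcoef(traits, rowvar=False)   # trait-by-trait correlations
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # sorted in descending order
    pct = 100.0 * eigvals / eigvals.sum()      # share of total variability
    return eigvals, pct

# Synthetic stand-in for a 117-line x 6-trait table (NOT the study data).
rng = np.random.default_rng(0)
demo = rng.normal(size=(117, 6))
demo[:, 3] += demo[:, 2]                       # induce some correlation
eigvals, pct = pca_explained(demo)

# Check of the reported figures under the correlation-matrix assumption.
reported_pct = [100 * 3.189 / 6, 100 * 1.17 / 6]
```

With six standardized traits, the eigenvalues always sum to six, so the two reported eigenvalues alone are enough to recover the two reported percentages.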

Nature of Association among Antioxidant Traits
The association among six antioxidant traits revealed a strong positive correlation (r ≥ 0.7) of DPPH with TPC, FRAP and CUPRAC. Moreover, a strong correlation was noticed between FRAP and CUPRAC (Figure 4). However, TPC showed a moderate positive correlation (r: 0.5-0.7) with FRAP, but with CUPRAC a weak positive correlation (r < 0.5) was observed. Although CAT and CUPRAC showed a negative correlation with PEROX, the association was not significant (Figure 4). These positively or negatively correlated antioxidant traits may be controlled by closely linked genes, or the compounds may be structurally related. Therefore, a variety that accumulates high concentrations of one antioxidant may contain higher quantities of other correlated antioxidants.
Abbreviations: CAT: catalase (unit min⁻¹ g⁻¹); PEROX: guaiacol peroxidase (unit min⁻¹ g⁻¹); TPC: total phenolics content (mg catechol equivalent or CE 100 g⁻¹); DPPH: 2,2-diphenyl-1-picrylhydrazyl (% inhibition); FRAP: ferric reducing antioxidant power activity (µg ascorbic acid equivalent or AAE g⁻¹); CUPRAC: cupric ion reducing antioxidant capacity (µg trolox equivalent or TE g⁻¹).
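The correlation classes used above (strong: r ≥ 0.7; moderate: 0.5-0.7; weak: < 0.5) can be applied mechanically to any pair of trait means. The sketch below assumes Pearson's coefficient, as stated in the statistical analysis; the function names and the toy values are hypothetical.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two trait vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def strength(r):
    """Label a coefficient with the thresholds quoted in the text."""
    a = abs(r)
    if a >= 0.7:
        return "strong"
    if a >= 0.5:
        return "moderate"
    return "weak"

def describe(r):
    """Combine direction and strength, e.g. 'strong positive'."""
    direction = "positive" if r >= 0 else "negative"
    return f"{strength(r)} {direction}"

# Toy trait means for five lines (illustrative values only).
r_demo = pearson([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Significance testing of each coefficient is a separate step and is not covered by this labeling.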

Genetic Diversity Parameters Analysis
The representative mapping population containing 117 germplasm lines that exhibited wide variation for the six antioxidant traits was genotyped using 131 SSR markers. The genetic diversity parameters assessed from the panel population are shown in Supplementary Table S2. Genotyping with the 131 SSR markers detected a total of 508 marker alleles in the population, with a mean of 3.74 alleles per locus. The number of detected alleles varied from 2 to 7 per locus. The highest number of alleles was produced by the marker RM493 in the population studied for antioxidant contents in rice. The major allele frequency parameter was used for detection of variation by a marker in the population. The average major allele frequency linked to the polymorphic markers was computed to be 0.5598, with a range of 0.282 (RM493 and RM8044) to 0.923 (RM6054) (Supplementary Table S2). Informative genetic markers were detected by the PIC values. These showed a range of 0.142 (RM6054) to 0.789 (RM493), with a mean value of 0.496. The mean heterozygosity (Ho) observed from the population was 0.111, and heterozygosity varied from 0.00 to 0.957 (RM3735). A total of 23 marker loci showed a heterozygosity (Ho) value of 0.00 in the panel population. The mean gene diversity (He), which gives a measure of genetic diversity in the panel population, was 0.556, and gene diversity varied from 0.145 (RM6054) to 0.814 (RM493).
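The per-locus parameters reported above can be illustrated from allele frequencies alone. The sketch below assumes the textbook definitions (gene diversity He = 1 − Σp², and PIC = 1 − Σp² − Σ 2p_i²p_j² over allele pairs); the software used in the study may apply sample-size corrections that are omitted here, and the function names and the toy allele calls are hypothetical.

```python
from collections import Counter
from itertools import combinations

def allele_frequencies(alleles):
    """Relative frequency of each allele among the scored allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def diversity_stats(alleles):
    """Allele number, major allele frequency, gene diversity and PIC."""
    p = list(allele_frequencies(alleles).values())
    sum_sq = sum(x * x for x in p)
    gene_diversity = 1.0 - sum_sq
    # PIC subtracts the pairwise term 2 * p_i^2 * p_j^2 for every allele pair.
    pic = gene_diversity - sum(2 * a * a * b * b for a, b in combinations(p, 2))
    return {
        "n_alleles": len(p),
        "major_allele_freq": max(p),
        "gene_diversity": gene_diversity,
        "pic": pic,
    }

# Toy locus with three alleles at frequencies 0.5, 0.3 and 0.2.
stats = diversity_stats(["A"] * 5 + ["B"] * 3 + ["C"] * 2)
```

For this toy locus the gene diversity is 0.62 and the PIC is 0.5478; PIC is always at most the gene diversity, which matches the pattern of the ranges reported above.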

Population Genetic Structure Analysis
STRUCTURE 2.3.6 software was used to analyze the genetic structure of the diverse population studied for the six antioxidant traits, with the probable number of subpopulations (K) chosen at a high delta K value. The delta K value used in the analysis was the rate of change in the log probability of data between successive K values. The panel population was classified into two subpopulations by assuming K = 2 and using a ∆K peak value of 309.09 (Supplementary Table S3; Supplementary Figure S2). The two subpopulations were in the proportions of 0.727 and 0.273 for population 1 and population 2, respectively. The two subpopulations showed correspondence with the antioxidant traits of the genotypes present in the studied population; however, many germplasm lines were also included that did not show a correspondence with the six antioxidant traits. Therefore, the next ∆K peak value of 81.79 at K = 3 was compared, in which the population was classified into three subpopulations. Each subpopulation showed a reasonable correspondence with the majority of its members, and the correspondence of the germplasm lines with the six antioxidant traits was comparatively better than at K = 2. The three subpopulations showed inferred cluster proportions of 0.69, 0.206 and 0.104 for the subpopulations 1, 2 and 3, respectively. The Fst1, Fst2 and Fst3 values were 0.168, 0.356 and 0.362 for the subpopulations 1, 2 and 3, respectively (Supplementary Table S3; Supplementary Figure S3). A genotype with an ancestry value of ≥80% was assigned to the particular subpopulation.
The majority of the germplasm lines with high to very high antioxidant traits were present in subpopulation 2. The germplasm lines with moderate values for antioxidant properties were present in subpopulation 3. The majority of the germplasm lines showing poor to moderate values for antioxidants were in subpopulation 1 (Table 2; Figure 5). A low alpha value (α = 0.0441) was estimated for the panel population by the structure analysis at K = 3. Positively skewed leptokurtic distributions were observed for the mean alpha value, whereas the normally skewed leptokurtic distributions detected for each of the three Fst values of the three subpopulations showed a distinct variation among the Fst values (Supplementary Figure S4).
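The ≥80% ancestry rule used to place a genotype in a subpopulation, with the remaining genotypes treated as admix types, can be sketched as follows. The function name, the label format and the example membership coefficients are hypothetical and are not values from the study.

```python
def assign_subpopulation(q, threshold=0.80):
    """Assign a genotype from its inferred ancestry coefficients.

    q: sequence of membership coefficients, one per subpopulation.
    Returns 'SP1', 'SP2', ... when the largest coefficient reaches the
    threshold, otherwise 'Admix'.
    """
    best = max(range(len(q)), key=lambda k: q[k])
    return f"SP{best + 1}" if q[best] >= threshold else "Admix"

# Illustrative membership vectors at K = 3 (made-up numbers).
examples = {
    "line_a": [0.90, 0.05, 0.05],
    "line_b": [0.50, 0.30, 0.20],
    "line_c": [0.10, 0.80, 0.10],
}
labels = {name: assign_subpopulation(q) for name, q in examples.items()}
```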

Molecular Variance (AMOVA) and LD Decay Plot Analysis
Plants related by ancestry or by traits in a population are grouped into different population structures. The genetic variations within and between the subpopulations were computed at K = 3 for the analysis of molecular variance (AMOVA) (Table 3). The genetic variation among the populations estimated at K = 3 was computed to be 1%; among individuals it was 4%, and there was 95% variation within individuals in the panel population. Wright's F statistic was used to determine the deviation from the Hardy-Weinberg prediction. The parameter F_IT for differentiation of individuals within the total population and F_IS for the uniformity of individuals within the subpopulation in a population were computed. The F_IS and F_IT values within the population and the total population estimated on the basis of 131 marker loci were 0.045 and 0.051, respectively, whereas the total population had an F_ST value of 0.006 between the three subpopulations. Fst is used to indicate the subpopulations or population differentiation within the total population. A clear differentiation between the three subpopulations was observed from their distribution pattern based on the Fst values (Supplementary Figure S4). The non-random association of alleles at different loci is utilized in the marker-trait association analysis. The existence of marker-trait association is dependent on the LD decay rate in a population over a time period. The LD decay rate indicates the possibility of new genes or allelic variants controlling the antioxidant compounds associated with molecular markers for these traits. The syntenic r² value was used to plot the linkage disequilibrium decay of the population versus the physical distance in million base pairs (Figure 6A). Tightly linked markers had a higher r² value and the average r² values rapidly decreased as the linkage distance increased. In the LD plot, it is observed that the LD decay in the beginning was delayed in the studied panel population.
However, a decline in LD can be noted in the curve for the associated markers at about 1-2 megabase pairs and, thereafter, a gradual and very slow decay can be noted. The graph clearly indicates the continuance of linkage disequilibrium decay in the population for the studied antioxidant properties in the rice population. The LD decay is limited by non-random mating, mutation, selection, migration or admixture, and genetic drift, all of which influence the estimates of LD. This LD decay plot also provides clues about the creation of genetic admixture groups for various antioxidant compounds in the normal population. A similar trend was also noted in the curves of marker 'P' versus marker 'F' and marker R² (Figure 6B). The detected markers from this study indicate the strength of the markers for the studied antioxidant traits.
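Two quantities in this section lend themselves to a quick numerical check. Wright's F statistics satisfy (1 − F_IT) = (1 − F_IS)(1 − F_ST), and the reported values of 0.045, 0.006 and 0.051 agree with this identity. The r² statistic is shown in its simplest two-locus, two-allele form; this is an assumption made for illustration, since SSR loci are multi-allelic and the software used in the study computes its own estimate. Function names are hypothetical.

```python
def f_it_from(f_is, f_st):
    """Total inbreeding coefficient implied by F_IS and F_ST."""
    return 1.0 - (1.0 - f_is) * (1.0 - f_st)

def ld_r2(p_ab, p_a, p_b):
    """Squared allelic correlation between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))

# Consistency check using the F_IS and F_ST values reported in the text.
f_it = f_it_from(0.045, 0.006)
```

Here `f_it` evaluates to about 0.0507, which rounds to the reported F_IT of 0.051.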

Principal Coordinates and Cluster Analyses for Genetic Relatedness among the Germplasm Lines
The two-dimensional plot for principal coordinate analysis (PCoA) was constructed based on the genotyping results of 131 SSR markers, and was used to classify the 117 germplasm lines as per the genetic relatedness among the lines (Figure 7). The inertia shown by component 1 was 12.29%, whereas component 2 exhibited 7.35%. The germplasm lines were assigned to the four quadrants at different places, forming three major groups (Figure 7). The biggest group accommodated all the germplasm lines of the subpopulations 2 and 3, and was clustered in the 2nd (bottom right) quadrant. The genotypes in the 1st quadrant were divided into two groups, of which one group at the top of the 1st quadrant forms the SP3 subpopulation, which has low to very low antioxidant properties in the seeds. The other group near axis 1 comprises only the admix type of germplasm lines. Several germplasm lines of quadrant 2 and those closer to axis 1 are also admix genotypes. The admix genotypes present on both sides of axis 1 are depicted in red (Figure 7).
The germplasm lines containing high to very high mean values for the antioxidant traits are grouped together, forming the subpopulation 3. This subpopulation is present on
Ward's clustering approach broadly grouped all the genotypes into two major groups. The largest cluster, cluster II, accommodated 65 germplasm lines that all carried very low, low or medium levels of antioxidant properties. Cluster I had only 52 germplasm lines. The dendrogram placed in this cluster all the germplasm lines that were rich in antioxidant traits for at least one compound. This cluster was again subdivided into two subgroups, which were further divided into sub-subclusters. Cluster II was divided into two main subclusters, which were finally divided into small groups. All of the sub-subclusters accommodated in Ward's clustering approach were based on the antioxidant traits present in the germplasm lines (Figure 8A).
The cluster analysis differentiated the germplasm lines on the basis of genotyping of 131 SSR markers and placed the genotypes into different clusters that corresponded with the studied antioxidant traits. The unweighted neighbor-joining tree differentiated the genotypes into three different clusters (Figure 8B). The cluster for subpopulation 3 was differentiated from SP2 by the presence of germplasm lines containing high antioxidant properties, whereas genotypes with moderate to high values were placed in subpopulation 2. The green-colored portion of the tree is designated SP2 whereas SP3 is shown in blue. The germplasm lines carrying very poor antioxidant properties were in subpopulation 3. The majority of the germplasm lines present in subpopulation 1 were poor to medium in antioxidant traits and are shown in pink. The germplasm lines with an admix type of population are depicted in red in the neighbor-joining tree (Figure 8B).

Marker-Trait Association for Antioxidant Traits in Rice
Marker-trait associations for total phenolic content; catalase and peroxidase for antioxidant enzymes; DPPH and FRAP for antioxidant activities; and CUPRAC for antioxidant capacity were computed by using the generalized linear model (GLM) and mixed linear model (MLM; K+Q model) in the TASSEL 5 software. The marker-trait association values were compared at less than 1% error, i.e., 99% confidence (p < 0.01). Five traits showed significant association with 43 SSR markers by GLM, and four traits with 14 SSR markers by MLM analysis at p < 0.01. The marker R² values varied from 0.05438 to 0.12875 by GLM, and from 0.06324 to 0.12586 by the mixed linear model (Supplementary Tables S4 and S5). A total of 11 significant marker-trait associations were detected by both the models for four antioxidant traits at p < 0.01 in the seeds of the germplasm lines (Figure 9A). Three significant marker-trait associations were detected for each of DPPH, FRAP and CUPRAC, and two for catalase (Table 4; Figure 9A). The Q-Q plot also confirmed the association of these markers with the associated antioxidant traits in rice (Figure 9B). Two markers, namely, RM1341 and RM3231, showed significant associations with the antioxidant enzyme, catalase, analyzed by GLM and MLM models at p < 0.01, and were present on chromosomes 11 and 8, respectively. The QTLs controlling the antioxidant activity of FRAP showed an association with SSR markers RM247 and RM309 present on chromosome 12. RM3701, which was present on chromosome 11, also showed a significant association with FRAP in both of the models. The CUPRAC assay was found to be significantly associated with marker RM235 present on chromosome 12 at the 101.8 cM position. Moreover, CUPRAC was strongly associated with RM148 located at 142.3 cM on chromosome 3, as detected by both the models.
The antioxidant activity measured by DPPH was found to be significantly associated with the markers RM247 and RM3701 present on chromosomes 12 and 11, respectively (Table S5; Figure 9A). The association mapping study for the antioxidant traits in rice seeds identified co-localization of QTLs controlling antioxidant properties in rice. It is observed that the same markers showed significant associations with different antioxidant traits in rice in both of the models (Table 4). Significant associations of marker RM3701 with antioxidant activities of DPPH, CUPRAC and FRAP present in the germplasm lines were detected. In addition, the association of RM247 with antioxidant activity of DPPH and FRAP was also detected by both of the models at p < 0.01 (Table 4). In the analysis of marker association by GLM, the markers RM468 and RM167 showed associations with both DPPH and FRAP activities.
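The selection logic described above (retain only marker-trait associations that are significant at p < 0.01 in both GLM and MLM, then look for markers shared by two or more traits) can be sketched as follows. The trait and marker labels and all p-values below are hypothetical placeholders, not results from the study, and the function names are illustrative.

```python
from collections import defaultdict

def significant(results, alpha=0.01):
    """Set of (trait, marker) pairs with p-value below alpha."""
    return {(trait, marker) for trait, marker, p in results if p < alpha}

def shared_associations(glm, mlm, alpha=0.01):
    """Associations declared significant by both models."""
    return significant(glm, alpha) & significant(mlm, alpha)

def colocalized(pairs, min_traits=2):
    """Markers associated with at least `min_traits` different traits."""
    by_marker = defaultdict(set)
    for trait, marker in pairs:
        by_marker[marker].add(trait)
    return {m: sorted(t) for m, t in by_marker.items() if len(t) >= min_traits}

# Hypothetical (trait, marker, p-value) records for the two models.
glm = [("T1", "M1", 0.001), ("T2", "M1", 0.004),
       ("T1", "M2", 0.008), ("T3", "M3", 0.020)]
mlm = [("T1", "M1", 0.002), ("T2", "M1", 0.009),
       ("T1", "M2", 0.030), ("T3", "M3", 0.005)]

shared = shared_associations(glm, mlm)
hotspots = colocalized(shared)
```

Requiring agreement between the two models is conservative: an association that passes only the GLM (such as the placeholder pair T1/M2 above) is dropped, since the GLM does not account for kinship.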

Discussion
Rice is the staple food for the majority of the world population. Many antioxidant compounds and enzymes that provide health benefits are present in different rice germplasm lines. Mapping of the genes regulating the antioxidant traits in rice germplasm lines and their deployment in breeding programs are very important for enhancing the antioxidant content in rice grains. The germplasm lines shortlisted for this study showed variation among the lines for the antioxidant traits in the population (Supplementary Table S1; Table 1). The results showed that a few antioxidant traits were correlated with each other, which will be useful for simultaneous transfer of multiple antioxidant traits into popular varieties. Therefore, there are possibilities for improvement of the antioxidant enzymes (such as catalase and peroxidase), antioxidant activities (namely, DPPH and FRAP) and antioxidant capacity (by CUPRAC assay) in rice based on the results from genetic variation and correlation obtained from the population (Table 1; Table 3). A number of studies on the existence of genetic variations for antioxidant content and activities have also been reported by researchers [25-28]. In addition, clear groups and subgroups were obtained in the phenotyping and molecular diversity analyses of antioxidant enzymes present in the population (Supplementary Table S2). The SSR markers showed good PIC values and related diversity parameters in the studied population; this result will be useful in antioxidant improvement programs. The germplasm lines used in this mapping study were from areas reported for high rice diversity, including the Jeypore region, a secondary center of origin of rice [29].
Several germplasm lines were identified as potential donors in which more than two traits for antioxidant enzymes and high activities were observed in the seeds. The genotypes rich in multiple antioxidant traits were Kundadhan, Latachaunri, AC.20282, AC.20246, AC.44646, AC.44595, AC.43737, AC.43660, AC.43732, AC.43738 and AC.43670, which will be useful as donors in antioxidant improvement programs (Table 1; Supplementary Table S3). Therefore, inclusion of the panel population for mapping of antioxidant traits will be effective. Structure analysis grouped the population into three subpopulations with different Fst values for each genetic group. The existence of genetic groups supported the continuance of linkage disequilibrium groups in the population. Detection of a low alpha value and the existence of many genetic admix-type germplasm lines in the population indicated that these antioxidant traits initially evolved from a single source during evolution of the trait. Different antioxidant compounds, enzymes and activities were formed in admix genotypes with different ancestry values during the evolutionary process. A good correspondence of genetic structure group and different traits was found earlier by many researchers [10,30-36].
Eleven significant marker-trait associations involving four antioxidant traits were detected by both the GLM and MLM approaches (Table 4). The marker-trait associations detected by both the models at p < 0.01 are considered to indicate very strong associations, and the markers will be useful in improvement programs. The strongly associated SSR markers (RM1341 and RM3231 for catalase enzyme; RM247, RM309 and RM3701 for FRAP activity; RM235 and RM148 for CUPRAC assay; and RM247 and RM3701 for DPPH activity) may be useful markers in marker-assisted antioxidant enzyme improvement programs in rice (Table 4). The Q-Q plot also confirmed the associations of these markers with the antioxidant compounds in rice (Figure 9B).
The QTLs for antioxidant capacity, i.e., DPPH, were reported by earlier researchers in rice [18-20]. In this investigation, the marker-trait associations for DPPH were detected with RM247, RM3701 and RM13600 present on chromosomes 12, 11 and 2, respectively. The previously reported QTL qACD12 on chromosome 12 was at the 252.06 Mb location [19]. We detected the QTL associated with marker RM247 at 31.85 Mb, which is a different locus on chromosome 12. The detected QTL on this chromosome is a new locus and is designated as qACD12.2, which regulates the DPPH activity in rice seeds. The QTL qACD2, reported on chromosome 2 by Shao et al. [19], was at 54.16 Mb. In our investigation, we detected the QTL on chromosome 2 near the location of 242.46 Mb for the trait. The associated QTL on this chromosome may be a new locus and is designated as qACD2.2, which influences the DPPH activity in rice seeds. Another mapping study for DPPH reports the QTL on chromosome 7 [18]. In addition, the mapping publication of Xu et al. [20] for the trait reports the QTL on chromosome 11. However, in this investigation, a QTL was detected on chromosome 11, which is in contrast to the above-reported QTLs. Therefore, this may be a new locus that regulates the antioxidant activity, DPPH, and is designated as qACD11.1.
The markers RM1341 and RM3231 were significantly associated with the antioxidant enzyme, catalase, as detected by both GLM and MLM analyses. These two markers are located on chromosomes 11 and 8 at 80.2 and 32.7 cM, respectively. No genes or QTLs were reported previously near these locations for the enzyme catalase, and hence the two QTLs are designated as qCAT11.1 and qCAT8.1, respectively. Three markers, namely, RM247, RM3701 and RM309, showed significant association with the antioxidant activity, FRAP. The markers RM247 and RM309 are positioned on chromosome 12 at 31.85 and 214.54 Mb, respectively. The other marker, RM3701, is located on chromosome 11 at 81.001 Mb. No QTL regulating the FRAP activity was reported near these locations on chromosomes 11 and 12, or in any other regions or chromosomes, in earlier publications. These QTLs controlling FRAP are designated as qFRAP12.1 near marker RM247 and qFRAP12.2 near RM309 on chromosome 12, and qFRAP11.1 on chromosome 11. The antioxidant capacity, CUPRAC, was found to be associated with three markers, RM3701, RM235 and RM148, on chromosomes 11, 12 and 3 at 81.001, 261.07 and 358.35 Mb, respectively. No QTLs for antioxidant capacity, CUPRAC, were reported earlier in these positions on chromosomes 3, 11 and 12, or in any other regions or chromosomes. Therefore, these three detected QTLs that influence CUPRAC are new loci and are designated as qCUPRAC3.1, qCUPRAC11.1 and qCUPRAC12.1, respectively.
In this investigation, more than two significant associations were observed in one location analyzed by both the models at p < 0.01. QTLs present on these locations will be useful for the simultaneous transfer of a greater number of traits. QTLs regulating the antioxidant activities for the assay of DPPH, FRAP and CUPRAC were detected to be co-localized on chromosome 11 near the region 81.001 Mb. This region on chromosome 11 may be considered as an antioxidant hotspot for regulating the activity in rice. Moreover, QTLs for DPPH and FRAP are co-localized on chromosome 12 at the 31.85 Mb position, which is also a hotspot on this chromosome for antioxidant activities (Table S5). These observations of the co-localized candidate genes on the chromosome indicate their usefulness for simultaneous inheritance of these QTLs during improvement programs. Hence, improvement for QTLs regulating the antioxidant enzymes will be very effective in such breeding programs. Recent publications also suggested easy improvements in the co-localized genes controlling various traits in rice [23,24,36,37]. Results of the present investigation showed that association mapping is an effective method to detect more potential loci for antioxidant enzymes and compounds in rice. Additional fine mapping of the detected loci will be undertaken for application in marker-assisted breeding for improvements in antioxidants in rice.

Seed Material
The study materials, consisting of 270 germplasm lines, comprising white (121) and colored (149) rice grain landraces and cultivars, were obtained from Gene Bank, ICAR-NRRI, Cuttack, and grown in the experimental plot of the Institute during the wet season, 2018 (Supplementary Table S1). The initial population was shortlisted on the basis of maturity duration (up to 135 days) and kernel color (red, black, purple and white) from about 1000 germplasm lines. The genotypes were grown in a randomized complete block design in three rows each with spacing of 20 × 15 cm in three replications following a recommended package of practices. Each replication was divided into 5 blocks accommodating 54 germplasm lines in each block. Panicles from the middle row of each genotype and replication were harvested, sun dried for 4-5 days to reduce the moisture content to 11-12%, stored for three months to remove dormancy and then used for estimation of CAT, PEROX, TPC, DPPH, FRAP and CUPRAC activity. A representative panel population containing 117 germplasm lines (white grain: 64; red grain: 53) was shortlisted from the initial 270 shortlisted germplasm lines. The panel population was raised during wet seasons of 2019 and 2020, and the antioxidant traits were estimated. The panel population (117) was used for mapping of antioxidant traits (Table 1).

Phenotyping for the Antioxidant Traits
The seed samples were dehulled by a Satake rice huller, Japan, ground into flour by a grinding machine (Glenmini grinder), sieved through a 100-mesh sieve and stored at 4 °C for the experiment. Seed enzymatic antioxidants such as catalase (CAT: unit min⁻¹ g⁻¹) and guaiacol peroxidase (PEROX: unit min⁻¹ g⁻¹) were estimated as per the procedures of Aebi [38] and Putter [39], respectively. Non-enzymatic antioxidants such as total phenolics content (TPC) were determined by the modified protocol of Zilic et al. [40] and expressed as catechol equivalent (mg CE 100 g⁻¹). DPPH (2,2-diphenyl-1-picrylhydrazyl) radical scavenging activity was estimated according to the method of Zhou et al. [41] with slight modification, and expressed as % inhibition. Ferric reducing antioxidant power (FRAP) activity was measured as per the modified procedure of Mau et al. [42] and results were expressed as µg ascorbic acid equivalent (AAE) g⁻¹. Cupric ion reducing antioxidant capacity (CUPRAC) was determined according to the method of Apak et al. [43] and the result was expressed as µg trolox equivalent (TE) g⁻¹.
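The text reports DPPH activity only as % inhibition and does not state the formula. A minimal sketch, assuming the commonly used definition based on the absorbance of a DPPH control and of the sample reaction, is given below; the function name and the absorbance readings are illustrative and are not taken from the cited protocol.

```python
def dpph_inhibition(abs_control, abs_sample):
    """Percent DPPH radical inhibition from two absorbance readings.

    Assumed definition: (A_control - A_sample) / A_control * 100.
    """
    if abs_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (abs_control - abs_sample) / abs_control * 100.0

# Made-up readings: control 0.80, sample 0.20.
inhibition = dpph_inhibition(0.80, 0.20)
```

A sample that leaves the control absorbance unchanged gives 0% inhibition, and complete bleaching of the radical gives 100%.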

Statistical Analysis
Cropstat 7.0 software was used for analysis of variance (ANOVA) for each trait, including the estimation of mean, range and coefficient of variation (CV %). Pearson's correlation coefficients were computed to determine the relationship among the various antioxidant traits based on the mean values of the 117 genotypes, and presented in a correlation matrix heatmap using PAST3 software. The germplasm lines were classified into five groups, i.e., very high, high, medium, low and very low categories, based on the mean values of the antioxidant traits.

Genomic DNA Isolation, PCR Analysis and Selection of SSR Markers
Genomic DNA was extracted from fifteen-day-old plants of the germplasm lines by the CTAB method [44]. A total of 131 simple sequence repeat (SSR) rice markers across the 12 chromosomes were taken from the public-domain database (http://gramene.org/) (Supplementary Table S2). The DNA fragments were resolved by gel electrophoresis for quantification of the isolated DNA. PCR analysis was performed using markers selected on the basis of positions covering all the chromosomes, to illustrate the diversity and to identify the polymorphic loci among the 117 rice germplasm lines (Table 1). The reaction conditions were set to an initial denaturation step (4 min, 94 °C), followed by 35 cycles of denaturation (30 s, 95 °C), annealing (45 s, 55 °C) and extension (1.3 min, 72 °C), then a final extension (10 min, 72 °C) and a hold at 4 °C. The PCR products were electrophoresed on 3% agarose gel containing 0.80 µg ml⁻¹ ethidium bromide. A 50 bp DNA ladder was used to determine the size of the amplicons. The gel was run for up to 4 h at 2.5 V cm⁻¹ and photographed using a Gel Documentation System (SynGene). The earlier publications of Barik et al. [45,46] and Pradhan et al. [47] were followed for the DNA isolation, electrophoresis and imaging techniques.

Molecular Data Analysis
The data were scored on the basis of the presence or absence of amplified products for each genotype-primer combination, and the resulting scores were entered into a binary data matrix as discrete variables. The parameters polymorphic information content (PIC), observed heterozygosity (H), number of alleles (N), major allele frequency (A) and gene diversity (GD) were computed for each SSR locus using 'Power Marker Ver 3.25' software [48]. The Bayesian model-based clustering software STRUCTURE 2.3.6 was used to analyze the genetic data and infer the population structure [49]. STRUCTURE was run with K varying from 1 to 10, with 10 iterations for each K value, to derive the ideal number of groups. A high-throughput parameter set of a burn-in period of 150,000 followed by 150,000 Markov Chain Monte Carlo (MCMC) replications was adopted for each run. The maximal value of L(K) was used to identify the exact number of subpopulations. The model choice criterion for detecting the most probable value of K was ∆K, an ad hoc quantity related to the second-order change in the log probability of data with respect to the number of clusters inferred by STRUCTURE [50]. Structure Harvester was used to estimate ∆K as a function of K, and the K value showing a clear peak, i.e., the highest ∆K in the Evanno table, was taken as the optimal K and used to define the subpopulation groups of the panel population [51]. The principal coordinate analysis of all the genotypes and the unweighted neighbor-joining unrooted tree, based on the Nei coefficient dissimilarity index [52] with a bootstrap value of 1000, were generated using DARwin5 software [53].
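Two of the quantities named above can be written down compactly. The sketch below is illustrative and is not the code of PowerMarker or Structure Harvester. It assumes the textbook definitions: gene diversity as 1 − Σp², PIC in the Botstein form, and ∆K as the absolute second-order difference of the mean Ln P(D) across replicate runs divided by the standard deviation of Ln P(D) at that K. All function names and the toy likelihood values are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def gene_diversity(freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphic information content: gene diversity minus the sum over
    allele pairs of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


def delta_k(lnp_runs):
    """Delta K for each interior K.

    lnp_runs maps K -> list of Ln P(D) values from replicate STRUCTURE runs.
    K values lacking a neighbour on either side, or with zero spread, are skipped.
    """
    means = {k: mean(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_runs[k])
            if sd > 0:
                second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
                result[k] = second_diff / sd
    return result


# Toy values only, not output from the STRUCTURE runs in this study.
runs = {1: [-100, -102], 2: [-80, -82], 3: [-50, -52], 4: [-48, -50]}
dk = delta_k(runs)
print(max(dk, key=dk.get), round(pic([0.5, 0.5]), 3))
```

In this toy input the likelihood stops improving sharply after K = 3, so ∆K peaks there. This is the kind of peak the Evanno criterion looks for.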
Analysis of molecular variance (AMOVA) in GenAlEx 6.5 software was used to estimate the molecular variance across the whole population, within populations and between the subpopulation structures (FIT, FIS and FST), calculated from the deviation from the Hardy-Weinberg expectation. The procedures followed in earlier publications were adopted for the molecular data analysis [23,30,54,55].
The software "TASSEL 5.0" was used to analyze the marker-trait associations for mapping the seed antioxidant traits in rice. The general linear model (GLM) and mixed linear model (MLM) in TASSEL 5.0 were used to determine the genetic association between the phenotypic traits and the molecular markers used in the study [56]. The associated markers were identified by considering the significant p-values and r² values detected by both models. The marker associations were further checked with the Q-Q plots generated by the software. The linkage disequilibrium plot was obtained by plotting the LD measure r² between pairs of markers against the distance between the pairs. In addition, the accuracy of the marker-trait associations was checked by estimating the FDR-adjusted p-values (q-values) using R software, as described in the earlier publications [23,30].
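The text does not name the FDR procedure used in R. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment, the q-values for a set of association p-values could be computed as follows. The function name and the p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg FDR-adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical marker-trait association p-values (not results from this study).
p = [0.01, 0.04, 0.03, 0.005]
print([round(q, 3) for q in bh_adjust(p)])
```

An association would then be retained only if its q-value falls below the chosen FDR threshold. The threshold itself is a choice for the analyst and is not specified here.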

Conclusions
The representative panel population, developed by shortlisting 117 germplasm lines based on the phenotypic groups of six antioxidant traits, showed wide genetic variation among the germplasm lines. Moreover, the population showed high diversity parameters based on the allele data of 131 SSR markers. Therefore, the choice of mapping population was effective for the association mapping study of the six antioxidant traits, viz., catalase, peroxidase, CUPRAC, DPPH, FRAP and TPC, using the population of 117 lines and 131 SSR markers. Donor lines rich in multiple antioxidant traits were identified from the population for antioxidant improvement programs. The population was classified into three genetic groups and showed reasonable correspondence with the antioxidant traits. The presence of linkage disequilibrium in the population was confirmed from the estimated Fst values. A total of 11 significant marker-trait associations for antioxidant enzymes and activities were detected, namely, qACD2.1, qACD11.1 and qACD12.2 for DPPH; qCAT8.1 and qCAT11.1 for catalase; qFRAP11.1, qFRAP12.1 and qFRAP12.2 for FRAP; and qCUPRAC3.1, qCUPRAC11.1 and qCUPRA12.1 for CUPRAC. Co-localization of the QTLs was detected for qACD11.1, qFRAP11.1 and qCUPRAC11.1 regulating DPPH, FRAP and CUPRAC activities, respectively, and qACD12.2 and qFRAP12.1 remained close on chromosome 12. These QTLs will be useful in antioxidant improvement programs in rice.

Funding: No external funding was received for this research work. The Institute's internal funding was used for this investigation.