Genetic Dissection of Grain Nutritional Traits and Leaf Blight Resistance in Rice

Colored rice is rich in nutrients and is also a good source of valuable genes/quantitative trait loci (QTLs) for nutrition, grain quality, and pest and disease resistance traits for use in rice breeding. Genome-wide association analysis using high-density single nucleotide polymorphism (SNP) markers is useful for precisely detecting QTLs and genes. We carried out genome-wide association analysis in 152 colored rice accessions, using 22,112 SNPs to map QTLs for nutritional, agronomic, and bacterial leaf blight (BLB) resistance traits. Wide variation and normal frequency distributions were observed for most of the traits except anthocyanin content and BLB resistance. The structure and principal component analyses revealed two subgroups. The linkage disequilibrium (LD) analysis showed 74.3% of the marker pairs in complete LD, with an average LD distance of 1000 kb and, interestingly, 36% of the LD pairs were less than 5 kb apart, indicating high recombination in the panel. In total, 57 QTLs were identified for ten traits at p < 0.0001, and the phenotypic variance explained (PVE) by these QTLs varied from 9% to 18%. Interestingly, 30 (53%) QTLs were co-located with known or functionally-related genes. Some of the important candidate genes for grain zinc (Zn) were OsHMA9, OsMAPK6, OsNRAMP7, OsMADS13, and OsZFP252, and those for BLB resistance were Xa1, Xa3, xa5, xa13, and xa26. The red rice genotype Sayllebon, which is high in both Zn and anthocyanin content, could be a valuable material for a breeding program for nutritious rice. Overall, the QTLs identified in our study can be used for QTL pyramiding as well as genomic selection. Some of the novel QTLs can be further validated by fine mapping and functional characterization. The results show that pigmented rice is a valuable resource for mineral elements and antioxidant compounds; it can also provide novel alleles for disease resistance as well as for yield component traits. Therefore, large opportunities exist to further explore and exploit more colored rice accessions for use in breeding.

The accessions showed wide variation in response to 14 different Xoo strains. The disease score, measured as lesion length (LL), showed skewed distributions. The responses to Xoo strains PXO61 and PXO339 showed the widest and narrowest LL variation, respectively. Figure 2 shows the correlations among the traits. Of the 36 possible correlations among all traits evaluated in both environments, a total of 17 were significant; three of them were positively correlated and 11 were negatively correlated. DH was negatively correlated with NP, RR, and TGW, but positively correlated with NSP; NP was negatively correlated with PL, NSP, and AC; PL was positively associated with NSP and AC; NSP was negatively correlated with RR and TGW; RR was positively associated with TGW, but negatively with AC; and TGW was negatively associated with AC. Similarly, Fe and Zn were negatively correlated with DH, whereas Fe and Zn had a highly significant positive correlation with each other.

Abbreviations: DH = days to heading; NP = number of panicles; PL = panicle length (cm); NSP = number of spikelets per panicle; RR = ripening ratio; GL = grain length; GW = grain width; TGW = thousand-grain weight (g); AC = anthocyanin content (mg/100 g); Zn = zinc (mg/kg); Fe = iron (mg/kg); K = Korea; I = IRRI, Philippines.

Genetic Analysis and Linkage Mapping
Genotyping-by-sequencing (GBS) data were aligned with the Nipponbare reference genome and SNPs were called. A total of 22,112 SNP markers (with an average density of one SNP per 19.4 kb) covering all 12 chromosomes were used to assess the genetic structure of the 152 rice accessions. The number of SNPs on each chromosome varied from 1219 on chromosome 12 to 3315 on chromosome 1. All the other chromosomes had more than 1400 SNPs each. The magnitude of LD and its decay with genetic distance determine the resolution of association mapping. The LD analysis in the colored rice panel identified 809,803 marker pairs (74.3%) in complete LD (Table 2). The shortest physical distance group (0-5 kb) had the highest average LD (0.653). Average LD declined to 0.340 (a 48% decline) at a physical distance of >750 to 1000 kb, before it increased at a distance of more than 1000 kb (Figure 3). The same pattern was observed for the percentage of marker pairs in complete LD, which decreased from 36.5% in the shortest distance group to 1.7% at a physical distance of >750 to 1000 kb.
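As an illustration of how LD can be summarized by physical distance, the following minimal Python sketch computes pairwise r² between biallelic markers and averages it within distance bins. It is not the pipeline used in this study: the function names, the 0/1 coding of homozygous genotypes, the toy data, and the bin edges other than 0-5 kb and >750 to 1000 kb are assumptions made for illustration, and all markers are assumed to lie on one chromosome.

```python
from collections import defaultdict
from itertools import combinations


def r_squared(a, b):
    """Squared correlation between two biallelic markers coded 0/1 per accession."""
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:  # monomorphic marker: r^2 is undefined
        return None
    d = pab - pa * pb
    return d * d / denom


def ld_by_distance(positions_bp, genotypes, bin_edges_kb):
    """Return {upper bin edge in kb: (mean r^2, fraction of pairs in complete LD)}."""
    values = defaultdict(list)
    for i, j in combinations(range(len(positions_bp)), 2):
        r2 = r_squared(genotypes[i], genotypes[j])
        if r2 is None:
            continue
        dist_kb = abs(positions_bp[j] - positions_bp[i]) / 1000
        # First bin whose upper edge covers the distance; inf collects the rest.
        edge = next((e for e in bin_edges_kb if dist_kb <= e), float("inf"))
        values[edge].append(r2)
    return {
        edge: (sum(v) / len(v), sum(x >= 1 - 1e-12 for x in v) / len(v))
        for edge, v in values.items()
    }


# Hypothetical toy data: three markers scored on four accessions.
positions = [1_000, 3_000, 900_000]
genotypes = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
summary = ld_by_distance(positions, genotypes, [5, 750, 1000])

# Relative decline in average LD between the two distance groups reported in the text.
decline = (0.653 - 0.340) / 0.653
```

With the reported averages, `decline` evaluates to about 0.48, which matches the 48% decline stated above.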

Structure and Principal Component Analysis
Population structure analysis, using a subset of SNP markers, revealed that the variance of log likelihood increased from K = 1 to K = 10, and the highest ΔK of 3762.2 was observed at K = 2 (Figure 4a,b), indicating that the population can be divided into two subgroups, with 114 and 42 individuals belonging to clusters 1 and 2, respectively. The cluster membership of each individual (Q1) and kinship data were used in the GWAS for all traits. The accessions were then plotted in three dimensions on the first three principal components (PC1, PC2, and PC3). In this three-dimensional (3D) scatter plot from principal component analysis (PCA), based on 22,112 SNPs, two major clusters were clearly distinguished among all the colored rice accessions (Figure 5), which is consistent with the population structure analysis, which also grouped the accessions into two subgroups. Rice genotypes from cluster 1 are shown in red, whereas cluster 2 genotypes are shown in black. The first three PCs accounted for 45% of the total variance, with contributions of 35%, 6%, and 4% from PC1, PC2, and PC3, respectively.
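The ΔK statistic used to choose K is commonly computed with the Evanno approach, as the absolute second-order change in the mean log likelihood divided by the standard deviation of the log likelihood across replicate runs. The Python sketch below is a minimal illustration under that assumption. The function name and the toy log-likelihood values are hypothetical and are not the values behind Figure 4, and the second difference is taken on the run means, which is a common simplification.

```python
from statistics import mean, stdev


def evanno_delta_k(log_likelihoods):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    log_likelihoods maps K to a list of Ln P(D) values from replicate runs.
    """
    means = {k: mean(v) for k, v in log_likelihoods.items()}
    delta = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the ends of the tested range
        sd = stdev(log_likelihoods[k])
        if sd == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical replicate log likelihoods for K = 1..4.
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-780.0, -770.0, -790.0],
    4: [-775.0, -765.0, -785.0],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
```

In this toy example the largest ΔK occurs at K = 2, mirroring the way a single sharp peak is read as the most likely number of subgroups.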

Genome-Wide Association Mapping
In total, 35 QTLs were identified for agronomic and nutritional traits at p < 0.0001 (Table 3 and Figure 6). They were distributed on all the chromosomes except chromosome 7, and the phenotypic variance explained (PVE) by these QTLs varied from 9% to 18%. The highest number of QTLs was identified on chromosome 1 (seven), followed by chromosomes 6 and 11, each with five QTLs. On the other chromosomes, the number of QTLs varied from one to four. Similarly, for BLB resistance, 22 QTLs were identified on chromosomes 1, 4, 7, 8, 9, 10, 11, and 12. The highest number of QTLs was identified on chromosome 1 (four), followed by chromosomes 4, 7, 8, and 9, each with three QTLs. The PVE of the QTLs varied from 9.9% to 12.9%. Four QTLs were identified for resistance to PXO330, and three QTLs each for resistance to PXO61 and PXO99. The false discovery rate (FDR) estimate was non-significant for all the QTLs. Details of the QTLs identified for BLB resistance traits are provided in Table 4.
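The text does not state which FDR procedure was applied. As one common possibility, the sketch below shows the Benjamini-Hochberg adjustment in Python; the function name and the example p-values are hypothetical and serve only to illustrate how adjusted values are obtained from raw marker p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four markers.
raw_p = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(raw_p)
```

A marker would then be declared significant when its adjusted value falls below the chosen FDR level, which is a stricter criterion than a fixed threshold on the raw p-value.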

QTLs for Agronomic and Nutritional Traits
1. DH: Eight QTLs were identified for days to heading on chromosomes 1, 3, 5, 6, and 10. Most of them were located on chromosome 5 (

8. Fe: Two QTLs were identified for grain Fe content, one each on chromosomes 6 and 12. Both had a PVE of more than 10%.
9. Zn: Five QTLs were identified for grain Zn content on chromosomes 1, 6, and 12, with three of them on chromosome 12. All of them had a PVE of more than 10%; qZn 12.2 had the highest PVE (17.9%).
10. AC: Two QTLs were identified for anthocyanin content, one each on chromosomes 1 and 10, with a PVE of 14.5% and 13.1%, respectively.

QTLs for BLB Resistance
A total of 22 QTLs were mapped, for only 11 of the 14 Xoo strains screened on the colored rice panel. All of the QTLs had a PVE of more than 10%, except qBLB 1.2, qBLB 1.4, and qBLB 4.3 (Table 4 and Figure 6). Four QTLs (qBLB 4.3, qBLB 5.1, qBLB 7.2, and qBLB 7.3) were identified for resistance to Xoo strain PXO363. Three QTLs each were identified for resistance to Xoo strains PXO61 and PXO99.

Identification of Donor Lines for Grain Zn and Anthocyanin
Among the colored rice samples, six high-Zn lines and another six high-AC lines were identified (Table 5). Among the selected lines, Zn content ranged from 24.8 to 26.6 mg/kg, whereas AC was more than 295.6 mg/kg. However, only one accession, Sayllebon, had a high content of both Zn (24.8 mg/kg) and AC (375.4 mg/kg).

Discussion
Colored rice is a rich source of vitamins and minerals, containing several-fold higher nutrients than regularly consumed white rice, so it can contribute significantly to human health and nutrition [94-96].
Thus, efforts are being made to conserve, characterize, and cultivate colored rice accessions and promote the consumption of colored rice and brown rice as part of the major health and nutrition initiatives in many countries of Asia [24,97]. In the present study, we characterized colored rice accessions for nutritional, agronomic, and BLB resistance traits, and carried out association mapping and candidate gene analysis to facilitate the development of healthier rice varieties.
Biofortification has been proven to be one of the most cost-effective methods of combating Fe and Zn deficiencies [40,98-102]. However, the accumulation of mineral elements in the edible parts is a complex process involving multiple QTLs/genes and is highly influenced by environmental factors [33,103]. To be adopted successfully, biofortified rice varieties with high grain mineral concentration should also be high yielding, have desirable grain quality traits, and be resistant to major pests and diseases [16,104]. Thus, an understanding of the molecular basis of all of these complex traits will help in precisely pyramiding several genes and QTLs to develop superior nutritious rice varieties that farmers will readily adopt.
Mostly normal frequency distributions were observed for the nutritional and agronomic traits, and skewed distributions for BLB resistance, indicating polygenic and oligogenic/monogenic genetic control, respectively, which is the normal trend reported for these traits [105-108]. Accessions with high Zn content included CR0021, Quakor, and Trunia, and those with high AC included Sayllebon, Filiwa, and Koni. They are useful for breeding and can also be promoted directly as healthier rice. Selection of rice donor parents with multiple beneficial traits seems to be a good strategy to reduce the impact of linkage drag in breeding. Our study found that the red rice genotype Sayllebon, which is high in both Zn and AC, could be a valuable material for a breeding program for nutritious rice. Sayllebon/3-203 was also found to contain a higher amount of the antioxidant compound γ-oryzanol (9.1 mg/100 g hulled rice) [109]. Even though higher Fe, Zn, and AC have been reported in colored rice, our results showed little variation for Fe, but wide ranges for Zn (9.2 to 26.6 mg/kg) and AC [2,35,107,110-112]. The wide range of variability for different traits indicated the role of genotype as well as environmental effects in the expression of these traits. In general, rice accessions have less variability for Fe in the endosperm. Some colored rice genotypes may have high Fe in the endosperm, but such accessions are likely to be very rare. In contrast, the huge variability available for grain Zn, anthocyanin content, and antioxidant compounds in colored rice can be exploited in breeding programs.
The correlations among nutritional and agronomic traits exhibited known trends. DH was negatively correlated with yield components, such as NP, RR, and TGW, and positively correlated with NSP; TGW was negatively associated with AC but, interestingly, did not show any significant relationship with Fe and Zn; AC was negatively correlated with most of the yield components and significantly positively correlated only with PL. Similarly, Fe and Zn were negatively correlated with DH, whereas Fe and Zn had a highly significant positive correlation with each other. The number of spikelets per panicle (NSP) is an important trait that determines the number of grains per unit area. Large variation in NSP was reported in previous studies [113-115]. According to Kato (1986) [116], RR is usually low in rice genotypes with a higher number of grains, which is in agreement with our results. Further, the significant positive correlation identified between Fe and Zn is supported by earlier findings [117,118]. Fe and Zn are reported to share genomic regions, genes, or biochemical pathways [119,120]. QTLs/genes for highly correlated traits are often co-located and may be tightly linked or pleiotropic; where they are linked to undesirable alleles, linkage drag must be eliminated through pre-breeding or precise introgression of the target region into elite genetic backgrounds [121].
The SNP density used in the analysis was very high, with an average of one SNP for every 19.4 kb. All the chromosomal regions were well covered without any significant gaps. A high-density SNP set is desirable for accurate QTL/gene detection [122,123]. In our population, few subgroups and a high number of marker pairs in complete LD were detected. The structure and PC analyses clearly grouped the accessions into two subgroups without admixture, indicating free flow of genes within groups and less gene pool sharing across groups. We detected 74.3% of the marker pairs in complete LD, with an average LD distance of 1000 kb, and, interestingly, about one-third (36%) of the LD pairs were less than 5 kb apart, indicating a high rate of recombination in the panel. The identification of subgroups and the generation of kinship data for use in association analysis are important to avoid spurious associations [124,125]. A low average LD distance with high recombination is essential for association analyses to accurately detect the location of QTLs/genes [122]. We used structure with kinship (Q+K) information for the GWAS analyses.
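For reference, a Q+K analysis is usually formulated as a mixed linear model. The form below is the standard textbook notation and is an assumption here, since the exact model specification is not given in the text; all symbols are introduced only for illustration.

```latex
% Assumed standard Q+K mixed linear model; symbols are illustrative, not taken from the source.
\[
  \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{S}\boldsymbol{\alpha}
             + \mathbf{Q}\mathbf{v} + \mathbf{Z}\mathbf{u} + \mathbf{e},
  \qquad
  \mathbf{u} \sim N\!\left(\mathbf{0},\, \sigma_g^{2}\mathbf{K}\right),
  \quad
  \mathbf{e} \sim N\!\left(\mathbf{0},\, \sigma_e^{2}\mathbf{I}\right)
\]
```

Here y is the vector of phenotypes, Xβ collects fixed effects other than the marker and structure, Sα is the effect of the SNP being tested, Qv accounts for subgroup membership from the structure analysis, and Zu is the random polygenic effect whose covariance is proportional to the kinship matrix K. Including both Q and K is what guards against spurious associations arising from subgroup structure and relatedness.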
Genome-wide association studies are widely used in genetic analyses to exploit genetic diversity in rice and to identify novel alleles for multiple traits [119,126-130]. In our study, GWAS detected multiple major loci for all the traits except NSP and TGW. In total, 35 QTLs were identified for nutritional and agronomic traits at p < 0.0001 (Table 3). They were distributed on all the chromosomes except chromosome 7, and the PVE of these QTLs varied from 9% to 18%. The highest numbers of QTLs were identified for DH and RR. It is also interesting to note that 18 (51%) QTLs for agronomic and micronutrient traits were co-located with either known QTLs or genes for the respective traits. Previous studies reported several QTLs and genes for agronomic, yield, and nutritional traits in various biparental and natural mapping populations in rice [119,131-137]. For BLB resistance, 22 QTLs were identified; 12 (54%) of them were co-located with known BLB genes or with functionally-related genes. The PVE of the QTLs varied from 9.9% to 12.9% (Table 4). The co-location of candidate genes with QTLs provides evidence for the accuracy of mapping of the genetic loci. With the recent advancements in rice genomics, information on QTLs and genes for various traits in rice, obtained using different approaches, has been accumulating rapidly [138,139]. Some of the genes for disease resistance, nutritional and grain quality, and agronomic traits have been cloned, functionally validated, and successfully used in breeding programs [139,140]. However, there is a need to continuously search for novel alleles to cater to the diverse needs of breeding programs, to mitigate the adverse effects of climate change, and to meet the food and nutritional demands of farmers and consumers [38].
Prioritization of the candidate genes underlying major-effect QTLs for complex traits, and their functional validation, is necessary to understand their influence on phenotype and to develop functional markers for use in breeding [141-144]. Several QTLs identified in this study were co-located with known or functionally-related genes. qAntc 1.1 was co-located with the glu4 gene, which is related to seed glutelin quality (eating quality). The DH QTLs were co-located with PME1, lsi2, siz1, OsLti6b, dth1.1, OsCrRLK1L2, and OsFD2. The PME1, siz1, and OsFD2 genes are known to be involved in anther and leaf development [53,55,57]. qGL 4.1 co-located with GIF1, a gene that regulates grain size and grain filling in rice [59]. Similarly, grain weight QTLs qGW 1.2, qGW 6.1, and qGW 7.1 were co-located with OsaLeg1, SSG6, qgw1.1, qgrl1-1, AQED046, and SDGP7. SSG6 is a substandard starch granule-6 gene that is involved in the development of large starch granules in the endosperm [61], whereas AQED46 is related to 1000-grain weight [62]. The results of this study suggest that SSG6 is a novel protein that controls grain size. SSG6 will be a useful molecular tool for future starch breeding and applications. For Fe and Zn, three of the seven QTLs were co-located with metal homeostasis genes. These were a zinc ion transporter, OsCNGC16, OsDof, OsHMA9, OsNRAMP3, OsbZip85, OsNRAMP7, and ZFP252.
Among the BLB resistance QTLs, eight of the 22 QTLs (qBLB 1.4, qBLB 4.3, qBLB 5.1, qBLB 7.1, qBLB 7.3, qBLB 8.1, qBLB 8.3, and qBLB 11.2) were co-located with 12 known BLB genes or functionally-related genes (OsLOL2, Xa1, OsWRKY45, xa5, xa8, oscbt, rtGA2.1, OsGLP8, Os8N3, xa13, Xa3, and Xa26). Furthermore, the BLB resistance gene-rich region of chromosome 8 was tagged by two QTLs (qBLB 8.1 and qBLB 8.3), with which OsGLP8, Os8N3, and xa13 were co-located. qBLB 11.2 also co-located with two important BLB genes, Xa3 and Xa26, and this region was also identified in other studies [145-149]. GWAS also identified SNP markers near known genes related to disease resistance and stress tolerance, such as OsRLCK19, OsLOL2, and OsGLP8, which are plausible candidate resistance genes on chromosomes 1 and 8. Similarly, a GWAS in a Multi-Parent Advanced Generation Inter-Cross (MAGIC) population identified several major and minor loci for BLB resistance [150]. Although major BLB resistance genes have been successfully introgressed using marker-assisted breeding, widespread use of these varieties narrows the genetic base and places high selection pressure on the prevailing BLB strains, leading to more virulent strains that can overcome the major R genes of rice. Thus, pyramiding of small-effect BLB QTLs along with major genes is necessary for more sustainable BLB resistance [106,151]. The qBLB 11.2 region was also very close to the location of four reported blast resistance genes, namely, Pik1, Pik2, Pikm, and Pik-p [152-154]; thus, qBLB 11.2 might be useful for developing resistance to both BLB and blast in rice.

Conclusions
This study reports QTLs identified for nutritional, agronomic, and BLB resistance traits using GWAS in a colored rice diversity panel. Colored rice accessions have higher nutritive value than non-pigmented rice genotypes as they contain higher levels of micronutrients and antioxidant compounds. The lines with higher nutritional value identified in this study could be used as donor parents for nutritional traits, and crossed with high-yielding mega-varieties to develop new high-yielding and nutritious rice cultivars. Furthermore, the identified BLB resistance loci might be beneficial in developing rice cultivars resistant to BLB. The introgression of major-effect QTLs for the desired traits identified in this study will also enhance the efficiency of marker-assisted breeding programs.

Conflicts of Interest:
The authors declare no conflict of interest.