Genome-Wide Identification of QTLs for Grain Protein Content Based on Genotyping-by-Resequencing and Verification of qGPC1-1 in Rice

To clarify the genetic mechanism underlying grain protein content (GPC) and to improve rice grain qualities, the mapping and cloning of quantitative trait loci (QTLs) controlling the natural variation of GPC are very important. Based on genotyping-by-resequencing, a total of 14 QTLs were detected with the Huanghuazhan/Jizi1560 (HHZ/JZ1560) recombinant inbred line (RIL) population in 2016 and 2017. Seven of the fourteen QTLs were repeatedly identified across two years. Using three residual heterozygote-derived populations, a stably inherited QTL named as qGPC1-1 was validated and delimited to a ~862 kb marker interval JD1006–JD1075 on the short arm of chromosome 1. Comparing the GPC values of the RIL population determined by near infrared reflectance spectroscopy (NIRS) and Kjeldahl nitrogen determination (KND) methods, high correlation coefficients (0.966 and 0.983) were observed in 2016 and 2017. Furthermore, 12 of the 14 QTLs were identically identified with the GPC measured by the two methods. These results indicated that instead of the traditional KND method, the rapid and easy-to-operate NIRS was suitable for analyzing a massive number of samples in mapping and cloning QTLs for GPC. Using the gel-based low-density map consisted of 208 simple sequence repeat (SSR) and insert/deletion (InDel) markers, the same number of QTLs (fourteen) were identified in the same HHZ/JZ1560 RIL population, and three QTLs were repeatedly detected across two years. More stably expressed QTLs were identified based on the genome resequencing, which might be attributed to the high-density map, increasing the detection power of minor QTLs. Our results are helpful in dissecting the genetic basis of GPC and improving rice grain qualities through molecular assisted selection.


Introduction
Rice grain quality, including appearance, milling, cooking and eating, as well as nutritional qualities, determines the market value, and is getting more and more concern from rice researchers, derived from a cross between an indica variety Huanghuazhan (HHZ) and a japonica accession Jizi1560 (JZ1560). Third, we compared the GPC QTLs identified with the high-and low-density map in the same HHZ/JZ1560 RIL population. Finally, one stably inherited major QTL (qGPC1-1) located on the short arm of chromosome 1 was validated using three secondary populations developed from three residual heterozygotes (RH) with the heterozygous genotype at the target interval.

Phenotypic Variation and Correlation Analysis
In 2016 and 2017, the GPC of the HHZ/JZ1560 RIL population and the parents was measured by NIRS and KND, respectively. The descriptive statistics of phenotypic variation are shown in Table 1. Significant phenotypic differences were observed between the parents, and the GPC of JZ1560 was higher than that of HHZ in both years. Frequency distributions of GPC in the brown rice flour of RILs and the parents were plotted (Figure 1). The GPC of the RILs showed similar distribution either measured by the KND or NIRS method in each year. Phenotypic variation was continuously distributed with low skewness and kurtosis, showing a typical pattern of quantitative variation. Using the KND method, the GPC of the RILs showed a wide variation from 7.27% to 15.78% in 2016 and from 7.39% to 15.98% in 2017 ( Figure 1). Continuous segregation and significant transgressive segregation at two directions suggested polygenic control underlying this trait.

Phenotypic Variation and Correlation Analysis
In 2016 and 2017, the GPC of the HHZ/JZ1560 RIL population and the parents was measured by NIRS and KND, respectively. The descriptive statistics of phenotypic variation are shown in Table 1. Significant phenotypic differences were observed between the parents, and the GPC of JZ1560 was higher than that of HHZ in both years. Frequency distributions of GPC in the brown rice flour of RILs and the parents were plotted (Figure 1). The GPC of the RILs showed similar distribution either measured by the KND or NIRS method in each year. Phenotypic variation was continuously distributed with low skewness and kurtosis, showing a typical pattern of quantitative variation. Using the KND method, the GPC of the RILs showed a wide variation from 7.27% to 15.78% in 2016 and from 7.39% to 15.98% in 2017 ( Figure 1). Continuous segregation and significant transgressive segregation at two directions suggested polygenic control underlying this trait.  Significant positive correlations between the GPC values measured by the two methods or in different years were found in the RIL population (Table 2). However, correlation coefficients for GPC determined by KND and NIRS methods as 0.966 in 2016 and 0.983 in 2017 were much higher than that of GPC measured with the same method in different years as 0.638 for NIRS and 0.631 for KND. This implied that GPC determined by NIRS was quite consistent with that by the KND method, and GPC was strongly influenced by environmental factors. Significant positive correlations between the GPC values measured by the two methods or in different years were found in the RIL population (Table 2). However, correlation coefficients for GPC determined by KND and NIRS methods as 0.966 in 2016 and 0.983 in 2017 were much higher than that of GPC measured with the same method in different years as 0.638 for NIRS and 0.631 for KND. This implied that GPC determined by NIRS was quite consistent with that by the KND method, and GPC was strongly influenced by environmental factors. Table 2. Correlation coefficients of GPC between two methods and between different years from 280 RILs.

QTL Analysis of Grain Protein Content in the HHZ/JZ1560 RIL Population
Combining the high-density genetic map containing 18,194 SNP markers with the GPC means of each RIL, 14 QTLs were detected on the whole genome except for chromosomes 6, 9 and 12 with each QTL explaining 0.81%-18.59% of the phenotypic variations (Table 3, Figures 2 and 3). Among the QTLs, 12 were identified in 2016 and nine in 2017. The detailed description of each QTL including peak location, peak LOD value, additive effect and percentage of total phenotypic variations (R 2 ) are showed in Table 3. Except for qGPC2, qGPC8 and qGPC10, the enhancing alleles for GPC were derived from JZ1560 at the remaining 11 loci as the brown rice of JZ1560 contained significantly higher GPC (Table 1).

QTL Analysis of Grain Protein Content in the HHZ/JZ1560 RIL Population
Combining the high-density genetic map containing 18,194 SNP markers with the GPC means of each RIL, 14 QTLs were detected on the whole genome except for chromosomes 6, 9 and 12 with each QTL explaining 0.81%-18.59% of the phenotypic variations (Table 3, Figures 2 and 3). Among the QTLs, 12 were identified in 2016 and nine in 2017. The detailed description of each QTL including peak location, peak LOD value, additive effect and percentage of total phenotypic variations (R 2 ) are showed in Table 3. Except for qGPC2, qGPC8 and qGPC10, the enhancing alleles for GPC were derived from JZ1560 at the remaining 11 loci as the brown rice of JZ1560 contained significantly higher GPC (Table 1).    In order to find the difference in the detection power of QTLs using different measurement methods of GPC, we further compared the QTLs for GPC determined by the NIRS and KND methods. In 2016, we identified 12 QTLs for GPC measured by NIRS and 11 QTLs for GPC measured by KND, which explained 55.07% and 48.25% of the total phenotypic variations, respectively. Eleven QTLs were commonly mapped using the two measurement methods, and one QTL (qGPC10) with a small genetic effect was only detected using the NIRS method (Figures 2 and 3, Table 3). Similar results were observed in 2017. Eight common QTLs were identified by both the measurement methods, and only one minor QTL (qGPC11) was mapped on chromosome 11 using the NIRS method. This indicated that QTLs for GPC were coincided between the two GPC measurement methods. Seven of the fourteen QTLs, qGPC1-1, qGPC1-2, qGPC3-1, qGPC3-2, qGPC4, qGPC5 and qGPC8, were repeatedly identified in both years. The remaining seven QTLs were only detected in one year.
QTL analysis was also performed using a low-density genetic map containing 208 SSR and InDel markers and a total of 14 QTLs were detected in the same HHZ/JZ1560 RIL population ( Figure 4, Table S1). Compared with the seven stably inherited QTLs identified in the high-density genetic map, only three QTLs including qGPC1, qGPC3-1 and qGPC5 were mapped at the same region and showed the similar effects for the two years using the low-density genetic map, suggesting that the highdensity genetic map increased the detection power of QTLs for GPC. In order to find the difference in the detection power of QTLs using different measurement methods of GPC, we further compared the QTLs for GPC determined by the NIRS and KND methods. In 2016, we identified 12 QTLs for GPC measured by NIRS and 11 QTLs for GPC measured by KND, which explained 55.07% and 48.25% of the total phenotypic variations, respectively. Eleven QTLs were commonly mapped using the two measurement methods, and one QTL (qGPC10) with a small genetic effect was only detected using the NIRS method (Figures 2 and 3, Table 3). Similar results were observed in 2017. Eight common QTLs were identified by both the measurement methods, and only one minor QTL (qGPC11) was mapped on chromosome 11 using the NIRS method. This indicated that QTLs for GPC were coincided between the two GPC measurement methods. Seven of the fourteen QTLs, qGPC1-1, qGPC1-2, qGPC3-1, qGPC3-2, qGPC4, qGPC5 and qGPC8, were repeatedly identified in both years. The remaining seven QTLs were only detected in one year.
QTL analysis was also performed using a low-density genetic map containing 208 SSR and InDel markers and a total of 14 QTLs were detected in the same HHZ/JZ1560 RIL population (Figure 4, Table S1). Compared with the seven stably inherited QTLs identified in the high-density genetic map, only three QTLs including qGPC1, qGPC3-1 and qGPC5 were mapped at the same region and showed the similar effects for the two years using the low-density genetic map, suggesting that the high-density genetic map increased the detection power of QTLs for GPC. The qGPC1-1 was detected on the short arm of chromosome 1 across two years using the highdensity genetic map and accounted for 9.14% to 11.85% of the phenotypic variations. The allele from  The qGPC1-1 was detected on the short arm of chromosome 1 across two years using the high-density genetic map and accounted for 9.14% to 11.85% of the phenotypic variations. The allele from JZ1560 at this locus increased GPC by 0.47%-0.58%. Corresponding to qGPC1-1, qGPC1 was mapped at the same location with the flanking markers JD1006 and JD1007 using the low-density genetic map in both years ( Figure 4, Table S1). The qGPC1 identified in this study contributed 11.78% to 13.33% of phenotypic variations with a relatively large additive effect ranging from 0.54% to 0.71% (Table S1).

Validation and Delimitation of qGPC1-1 Using RH-derived F 2 Populations
To confirm the genetic effect and location of qGPC1-1, three RH individuals were selected from one F 8 RIL line with the heterozygous genotype covering the target marker interval of JD1006-JD1007. Three RH-derived F 2 populations, named as WB01, WB02 and WB03, were developed from the three plants with sequential heterozygous segments extending from JD1006 to JD1007, respectively. Based on the sequence differences between the parents HHZ and JZ1560 identified by 30-fold whole genome re-sequencing, additional six InDel markers were developed and used to genotype the three populations (Table S2). GPC was continuously distributed and ranged from 8.33% to 11.90%, 8.27% to 12.19% and 8.32% to 10.47% in WB01, WB02 and WB03 populations, respectively ( Figure S1).
Three segmental linkage maps were constructed for WB01, WB02 and WB03, respectively ( Figure 5). Combined the genotype and phenotype information, qGPC1-1 was identified in WB01 and WB02 populations, with the JZ1560 allele always increasing GPC (Table 4; Figure 5). This QTL explained 26.00% and 27.40% of phenotypic variations with the similar additive effects of 0.36% and 0.39% in WB01 and WB02, respectively. No QTL for the GPC was detected in the WB03 population. Therefore, qGPC1-1 should be located within the common segregating regions of WB01 and WB02, but outside the segregating region of WB03. As shown in Figure 5, qGPC1-1 was delimited to the interval between JD1006 and JD1075 (~862 kb) with a common segregating region JD1068-JD1075 and one flanking cross-over region JD1006-JD1068. JZ1560 at this locus increased GPC by 0.47%-0.58%. Corresponding to qGPC1-1, qGPC1 was mapped at the same location with the flanking markers JD1006 and JD1007 using the low-density genetic map in both years ( Figure 4, Table S1). The qGPC1 identified in this study contributed 11.78% to 13.33% of phenotypic variations with a relatively large additive effect ranging from 0.54% to 0.71% (Table S1).

Validation and Delimitation of qGPC1-1 Using RH-derived F2 Populations
To confirm the genetic effect and location of qGPC1-1, three RH individuals were selected from one F8 RIL line with the heterozygous genotype covering the target marker interval of JD1006-JD1007. Three RH-derived F2 populations, named as WB01, WB02 and WB03, were developed from the three plants with sequential heterozygous segments extending from JD1006 to JD1007, respectively. Based on the sequence differences between the parents HHZ and JZ1560 identified by 30-fold whole genome re-sequencing, additional six InDel markers were developed and used to genotype the three populations (Table S2). GPC was continuously distributed and ranged from 8.33% to 11.90%, 8.27% to 12.19% and 8.32% to 10.47% in WB01, WB02 and WB03 populations, respectively ( Figure S1).
Three segmental linkage maps were constructed for WB01, WB02 and WB03, respectively ( Figure 5). Combined the genotype and phenotype information, qGPC1-1 was identified in WB01 and WB02 populations, with the JZ1560 allele always increasing GPC (Table 4; Figure 5). This QTL explained 26.00% and 27.40% of phenotypic variations with the similar additive effects of 0.36% and 0.39% in WB01 and WB02, respectively. No QTL for the GPC was detected in the WB03 population. Therefore, qGPC1-1 should be located within the common segregating regions of WB01 and WB02, but outside the segregating region of WB03. As shown in Figure 5, qGPC1-1 was delimited to the interval between JD1006 and JD1075 (~862 kb) with a common segregating region JD1068-JD1075 and one flanking cross-over region JD1006-JD1068.

Discussion
Elucidating the genetic mechanism of GPC accumulation is very important for regulating rice grain qualities in breeding. In the present study, we characterized the genetic basis of GPC and identified a total of 14 QTLs using the high-resolution map in the HHZ/JZ1560 RIL population. Although GPC is sensitive to environmental conditions and the QTLs for GPC are difficult to be repeatedly identified in different environments, the majority of the 14 QTLs have been reported in the previous studies. On the long arm of chromosome 1, QTLs for GPC have been reported in some studies including pro1 between RM226 and RM297 [15], qPC-1 between R888 and R1485 [13], qPC1 between RM472 and RM104 [6], qPC-1 between RM1196 and RM302 [19] and TGP1b between RM1297 and RM1067 [4]. The qGPC1-2 and qGPC1-3 were located in the adjacent chromosome regions with these reported QTLs, and the indica variety HHZ allele decreased the GPC. Two minor QTLs, qGPC2 and qGPC11-1, were also mapped at the similar locations with pro2 between RM6 and RM112 and pro11 between RM209 and RM229 [15]. On the chromosome 3, we detected three QTLs including qGPC3-1, qGPC3-2 and qGPC3-3, and the enhancing alleles were all from the japonica rice JZ1560. The qGPC3-1 with the largest effect in 2016 and qGPC3-2 were repeatedly detected as qPC-3.1 in the interval XNpb212-G1318 and qPC-3.2 in the interval R758-XNpb15, respectively [13]. The qGPC3-3 was located in the overlapping confidence interval with the QTLs for protein content in several previous reports [13,14,18]. These results indicated that there are multiple genetic factors controlling GPC on chromosome 3. The qGPC4 was detected within the qPC-4 region between RG214 and RG620 [12]. We still noted that qGPC5 was repeatedly identified as qRPC5 for rice protein content in the interval RG435-RG172a using a doubled haploid population [16]. Although qGPC7 was only detected in 2016, four QTLs were located in the same or adjacent regions as reported by previous studies [3,4,8,17]. On the chromosome 8, qGPC8 showed overlapping intervals with cp8.1, qPC-8a and TGP8, which have been identified using different populations in different environments [4,18,24]. The qPC11-2 with minor effect was mapped near to qPC11 between RM202 and RM206 [25]. No QTL for GPC has been mapped in the region of qGPC10 on chromosome 10 before, therefore qGPC10 might be newly detected in this study. Unlike many previous studies, we did not detect the QTL for GPC near to the Wx locus on chromosome 6 [4,8,12,25]. Over all, the locations of QTLs for GPC showed a significant similarity between our studies and the previous findings.
Of the 14 QTLs identified in this study, the qGPC1-1 was a stably expressed QTL with a relatively large effect and it was repeatedly detected in both years. This QTL was also identified in the similar chromosome region in different populations and environments, implying that qGPC1-1 plays an important role in controlling GPC [4,11,18]. Based on the primary mapping result, qGPC1-1 was further validated and delimited in the interval JD1006-JD1075, corresponding to the 6.0-6.8 Mb region on the short arm of chromosome 1 in the Nipponbare genome [26]. The japonica rice JZ1560 allele contributed to the increase of GPC in the RH-derived F 2 population (Table 4). Dissecting the genetic mechanism underlying GPC is important for the improvement of rice grain quality, and the main obstacle to date is the absence of key genes/QTLs regulating GPC. Primary mapping leads to a large confident interval and poor repeatability of target QTLs, which makes it difficult to find tightly linked markers for marker-assisted selection. Validation and delimitation of qGPC1-1 contributed to the facilitation of marker-assisted selection in rice breeding for high nutritional quality. Furthermore, based on these results, fine mapping and map-based cloning of qGPC1-1 is under way.
Mapping and isolation of QTLs need a high efficiency method to measure GPC. Cloning of QTLs controlling the natural variation of GPC is the important step toward uncovering the regulatory mechanism underlying this quantitative trait. However, map-based cloning of a QTL needs phenotype and genotype information of a massive number of samples, suggesting a rapid and easy operation method for GPC is necessary. Using the NIRS system, the GPC value can be directly measured once the brown rice is grinded into flour.
Compared with the NIRS, the KND method for GPC in rice needs further lengthy operation, which is time-consuming and laborious. More importantly, high correlation between the GPC values determined by the NIRS and KND methods was observed in the HHZ/JZ1560 RIL population across two years. Only two minor QTLs (qGPC10 and qGPC11-1) were detected using the single measurement method of GPC, the remaining twelve QTLs were identical to be identified by both methods in 2016 or 2017 (Figures 2 and 3, Table 3). Comparative analysis between the two methods suggested that NIRS could be a feasible strategy for the mapping and map-based cloning of QTLs for GPC instead of the KND method. In recent years, NIRS has been successfully employed in the isolation and characterization of two major QTLs for GPC, qPC1 and qGPC-10 [6,7].
Accompanied with the development of DNA sequence techniques, the sequencing cost decreases continuously and more and more high-density genetic maps have been constructed to detect QTLs for different traits through genotyping-by-sequencing [21,27,28]. In the present study, QTLs for GPC were mapped simultaneously using a high-density genetic map with 18,194 SNP markers identified by GBR and a low-density map with 208 gel-based SSR and InDel markers in the same RIL population. Although the same number of QTLs were identified using the different resolution genetic maps, 7 of 14 and 3 of 14 QTLs were repeatedly detected across two years in the high-and low-density genetic maps, respectively. Identification of more stably expressed QTLs might be attributed to the increased detection power resulted by the saturated SNP markers in the high-density genetic map.

Plant Materials and Field Experiments
The HHZ/JZ1560 mapping population consisting of 280 RILs was developed using the single-seed descendent method by Ying et al. [29]. HHZ is an indica variety as the female parent with small grains and being widely cultivated in China, while JZ1560 is a japonica rice accession with very large grains ( Figure S2). Three RH-derived F 2 populations (WB01, WB02 and WB03) consisting 180, 137 and 115 individuals, respectively, were originated from 3 RH plants that were selected from one line of the F 7 HHZ/JZ1560 RIL population. The three populations were used to validate the target QTL, qGPC1-1.
All the populations, together with the two parents, were planted with the spacing of 16.7 cm between plants and 26.7

GPC Measurement
GPC of brown rice flour for each RIL was carefully measured by the KND and NIRS methods, respectively. Rice grains were harvested after maturity and stored at room temperature for at least three months before the GPC measurement. The fully filled grains were de-hulled into brown rice by a rice huller (JLGJ-45, Taizhou, Zhejiang, China), then the brown rice was grinded into flour by a grinder (IKA TUBE-MILL, Staufen, Germany). Brown rice flour samples were directly used to determine GPC by NIRS (Foss, Sweden). Each sample was assayed twice and the mean values were used for further analysis. GPC was calculated according to the modified NIRS model constructed by Xie et al. [23].
For the KND method to measure GPC, 0.2 g brown rice flour, 1.0 g catalyst (CuSO 4 :Na 2 SO 4 = 1:10) and 4.0 mL of H 2 SO 4 were first added into a 100-mL digestion tube in turn. Then the mixture was immediately heated at 420 • C for 2 h in a digestion stove. After the solution being digested, the mixture was cooled to room temperature. Then, 10 mL ddH 2 O was added into the digestion tube. The mixed solution was analyzed using a Kjeltec 8400 Autoanalyzer (Foss, Sweden). GPC of the rice flour was calculated from the total N content by multiplying a conversion factor of 5.95 [30]. The assay for each sample was conducted with two replicates and the means were used for further analysis.

Genetic Map and DNA Marker Analysis
To compare the detection power of GPC QTL, high-density and low-density linkage maps of the RIL population were both used to map QTL, respectively. The high-density linkage map constructed by one of the GBR method, specific length amplified fragment, was composed of 18,194 SNP markers and spanned 2132.56 cM with an average genetic distance of 0.12 cM. In our previous study, we constructed the SLAF library for each RIL and the products were further sequenced using Illumina HiSeq 2500 system (Illumina, Inc.; San Diego, CA, USA) [29]. Polymorphism loci between the parents were identified for the selection of high-quality SNP markers after filtering out the low-quality raw reads. SNP markers with more than 20% missing data and the segregation distortion were further filtered out. A total of 18,194 high quality SNP markers were used to genotype the 280 RILs. The low-density linkage map constructed by the gel-based method consisted of 121 SSR and 87 InDel markers and spanned 1399.40 cM with an average distance of 7.61 cM.
For the three RH-derived populations, leaf samples of each individual were extracted for genomic DNA through the modified CTAB method [31]. To genotype the three RH-derived populations, a total of six InDel markers (Table S2) in the mapping interval of qGPC1-1 were designed with Primer3.0 (http://primer3.ut.ee/) based on the 30-fold genome resequencing of the parents, HHZ and JZ1560. According to our previous study [32], the PCR was performed in 10-µL reactions containing 2 × Taq MasterMix (CW0682, CWBIO) 5-µL, 0.4 µM of each primer and 50 ng DNA template. The PCR program was set as an initial denaturation at 94 • C for 2 min, then followed by 30 cycles of 30 s at 94 • C, 30 s at 55 • C and 30 s at 72 • C, and finally 2 min at 72 • C. The PCR products were analyzed on 2.5% agarose gels.

QTL Mapping and Statistical Analysis
Using the high-density genetic map, QTL analysis for GPC was performed by R/qtl software with the method of composite interval mapping (CIM) [33], taking the two years for the RIL population as two environments. A threshold of LOD > 2.0 was used for detecting a putative QTL. Using the low-density genetic map, QTLs controlling GPC were identified by Windows QTL Cartographer 2.5 [34]. Identification of QTLs was performed using CIM. The LOD threshold for claiming a QTL was also fixed as a LOD score of 2.0. QTL analysis in the three RH-derived populations was also conducted by CIM using Windows QTL Cartographer 2.5 [34].
Basic descriptive statistics, including mean, standard deviation, coefficient of variation, range, skewness and kurtosis, and correlation analysis for GPC in the RIL population were performed with Microsoft Excel 2016 for Windows. The t-test of the two parents was performed with the SAS program [35].

Conclusions
Based on genotyping-by-resequencing, a total of 14 QTLs controlling GPC were identified with an indica/japonica (HHZ/JZ1560) RIL population in 2016 and 2017. Of the 14 QTLs, 13 QTLs showed similar chromosome regions with the QTLs for GPC documented in previous studies. The qGPC10 with a minor effect was newly detected in this study. Seven of the fourteen QTLs were repeatedly identified across two years. The stably inherited qGPC1-1 with a relatively large effect was validated and delimited to a~862 kb region flanked by JD1006 and JD1075 on the short arm of chromosome 1, which is helpful for the construction of a marker-assisted selection system to improve rice grain qualities and further map-based cloning of this QTL. Our results indicated that instead of the KND method with lengthy operation, the NIRS with rapid and easy operation was a feasible strategy for measuring a massive collection of samples in the mapping and map-based cloning of GPC QTL. More stably expressed QTLs identified in the genetic map based on genotyping-by-resequencing suggested that high-density map enhanced the detection power of minor QTLs.