Multiple Genome Wide Association Mapping Models Identify Quantitative Trait Nucleotides for Brown Planthopper (Nilaparvata lugens) Resistance in MAGIC Indica Population of Rice

Brown planthopper (BPH), one of the most important pests of the rice (Oryza sativa) crop, becomes catastrophic under severe infestations and causes up to 60% yield loss. The highly disastrous BPH biotype in the Indian sub-continent is Biotype 4, which also known as the South Asian Biotype. Though many resistance genes were mapped until now, the utility of the resistance genes in the breeding programs is limited due to the breakdown of resistance and emergence of new biotypes. Hence, to identify the resistance genes for this economically important pest, we have used a multi-parent advanced generation intercross (MAGIC) panel consisting of 391 lines developed from eight indica founder parents. The panel was phenotyped at the controlled conditions for two consecutive years. A set of 27,041 cured polymorphic single nucleotide polymorphism (SNPs) and across-year phenotypic data were used for the identification of marker–trait associations. Genome-wide association analysis was performed to find out consistent associations by employing four single and two multi-locus models. Sixty-one SNPs were consistently detected by all six models. A set of 190 significant marker-associations identified by fixed and random model circulating probability unification (FarmCPU) were considered for searching resistance candidate genes. The highest number of annotated genes were found in chromosome 6 followed by 5 and 1. Ninety-two annotated genes identified across chromosomes of which 13 genes are associated BPH resistance including NB-ARC (nucleotide binding in APAF-1, R gene products, and CED-4) domain-containing protein, NHL repeat-containing protein, LRR containing protein, and WRKY70. The significant SNPs and resistant lines identified from our study could be used for an accelerated breeding program to develop new BPH resistant cultivars.

So far, no previous report is available for BPH resistance through the genome-wide association mapping (GWAS) approach in rice using high-density single nucleotide polymorphisms (SNPs). Here, we used a MAGIC indica population to identify the marker-trait associations through multiple GWAS models for BPH resistance using high-resolution SNP data. Six GWAS models were employed for the identification of significant SNPs and the results of the models were compared. The GWAS approach identified quantitative trait nucleotides (QTNs) associated with resistance genes and the results are discussed with reference to their utility in the rice breeding programmes.

Phenotyping, Scoring and Data Analysis
The Standard Seed box Screening Technique (SSST) [23] was used to screen the RiMi panel for BPH resistance in the glass-house facility at ICAR-the Indian Institute of Rice Research, Hyderabad, during the rainy seasons of 2016 and 2017 in an augmented complete block design. SSST is a rapid and efficient method to screen a large set of rice genotypes for the identification of

Phenotyping, Scoring and Data Analysis
The Standard Seed box Screening Technique (SSST) [23] was used to screen the RiMi panel for BPH resistance in the glass-house facility at ICAR-the Indian Institute of Rice Research, Hyderabad, during the rainy seasons of 2016 and 2017 in an augmented complete block design. SSST is a rapid and efficient method to screen a large set of rice genotypes for the identification of field-resistant cultivars in the greenhouse [24,25]. A commonly used susceptible line, Taichung Native 1 (TN1), was used for the mass rearing of insects [10,23]. Adults were initially collected from the rice fields and the pure colony was maintained in the greenhouse at 30 • C ± 50 • C temperature, and 60 ± 5% relative humidity, which was congenial for the mass multiplication of BPH. The females were released on the 60-day-old potted plants placed in 70 × 75 cm oviposition cages. Plants with egg masses were selected and kept in different cages for nymphal emergence and the second to third instar nymphs were collected for infestation on the RiMi panel.
Around 15 sprouted seeds in each of the 391 RiMi panel lines were seeded in a standard seed tray with a size of 60L × 45W × 10H cm, which was filled with fertilizer-enriched puddled soil. In each tray, 20 MAGIC lines were sown by keeping two genotypes per row with an inter-line spacing of 3 cm. Susceptible check TN1 was covered in the borders and the resistant checks PTB33 and RP2068 were used in the middle rows of the seed tray. The trays with three-day-old seedlings were placed in big plastic trays containing water of 5 cm depth. The phenotyping was started by introducing six to eight second to third instar nymphs into 12 days-old seedlings [25]. The infestation was scored on a 0 to 9 scale [26] when 90% of the TN1 seedlings were succumbed. The augmented block design was conducted with two replications in 10 blocks containing 40 RiMi lines in all the blocks except in one with 31 RiMi lines per replication. Data on three checks (Two resistant and one susceptible) was recorded from every second tray considering the genotypes planted in two trays as one block. The material was evaluated in the augmented block design. The recorded data was subjected to analysis of variance using the R package "augmented RCBD" [27,28].

Genotyping and SNP Calling in GBS Pipeline
The MAGIC panel was genotyped by the Genotyping-By-Sequencing (GBS) approach [29] through Illumina HiSeq. The raw GBS data files (raw genotype files are available at http://snpseek. irri.org/_download.zul) were curated by GBS pipeline using Tassel 3.0.169 [30] by filtering at <30% missing and 0.05 minor allele frequency (MAF) [22]. A set of 27,041 SNPs identified in the subset of the 391 RiMi lines were considered for the present experiment.

Genome-Wide Association Analysis
The curated 27,041 SNPs and BPH phenotypic data of the RiMi panel were used for GWAS analysis using genetic association and prediction integrated tools (GAPIT) developed by [31]. GWAS was performed excluding founders, considering the MAGIC lines as unrelated individuals, since MAGIC population has a negligible population structure [21,32]. Four single-locus models such as the general linear model (GLM) [33], mixed linear model (MLM) [34,35], compressed mixed linear model (CMLM) [36] and settlement of MLM under progressively exclusive relationship (SUPER) [37], and two multi-locus models such as the multi locus mixed model (MLMM) [38] and fixed and random model circulating probability unification (FarmCPU) [36], were engaged to identify the marker-trait association. Bonferroni correction was applied to modify the threshold value and to control the false positive rate (FDR) in four single-locus GWAS. The results from all six models were compared to find out the consistency and repeatability of the associations. SNPs obtained by FarmCPU were considered for the candidate gene search for BPH resistance.

Results and Discussion
Plants are challenged by numerous herbivorous insects invade their natural environment. In consequence, they express diverse substantial and induced defense mechanisms, safeguarding themselves from herbivore attacks [39]. The first BPH resistant rice cultivar is IR26, followed by IR36 and IR50, which are improved by the IRRI. A number of QTNs against BPH in rice have been reported in this study. BPH is the most destructive monophagous, phloem sap-sucking insect, throughout Asia, causing significant yield losses to the rice crop [40,41]. BPH causes an annual estimated economic loss of more than 300 million USD in rice production ecologies of Asia [41,42]. So far, around 38 genes were mapped for BPH by several experiments through biparental mapping approaches. Almost 80% of the resistance genes were mapped in four major clusters on the rice genome using the mapping populations of indica and other wild species. Chromosome 4 has 12 genes that are clustered in three regions (Bph30 and Bph33 in a 0.91-0.97 Mb region between markers H99 and H101; Bph3/17, Bph12, Bph15, Bph20(t), and bph22(t) in a 4. 1-8.9 Mb region between markers RM8212 and B44; and Bph6, bph18(t), Bph27, Bph27(t), and Bph34 in a 19.1-25.0 Mb region between markers RM16846 and RM6506), chromosome 12 L has eight genes (Bph1, bph2, bph7, Bph9, Bph10, Bph18, Bph21 and Bph26) that are clustered together in a 19. 1-24.4 Mb region between markers RM7102 and B122, five genes are on chromosome 6S (Bph3/ Bph32bph4, Bph22, Bph25 and bph29 ) in a 0.2-1.7 Mb region between markers S00310 and RM8101, and chromosome 3 has five genes (Bph13 and bph19(t) on 3S [37], and Bph11, Bph14, and Bph31 3 L) [43][44][45][46][47][48].
Controlling the pest becomes ineffective since surface spray of pesticides could not reach the base of the infestation point. The rampant use of chemicals results in the development of resistance against many insecticides and the evolution of new biotypes. Recently, [41] reported that the excessive use of nitrogen fertilizers in the agricultural fields has an adverse impact on the environment, is valuable, and can advance the pest herbivores. The development of durable resistant varieties is not only an efficient measure to manage the pests, but also to conserve the ecological balance and safety. The BPH population in India, referred to as the South Asian biotype (Biotype 4), is more virulent than those in Southeast Asia (Biotype 2 and 3) [49,50]. Resistance genes identified for the Indian biotype are very few and are unable to offer broad-spectrum resistance. Here, a high-density SNP genotyped MAGIC population was used to improve the resolution of the trait association as well as to find the causal genes for resistance.

Response to Infestation
Screening at the seedling stage is an effective high-throughput method for the rapid and efficient screening of large germplasms as field screening offers inconsistent results due to improper control over the experimentation. A major difficulty in the screening of rice genotypes for BPH tolerance under field conditions is the lack of efficient screening techniques, problems associated with biotype selection, and environmental factors on the uniform density of the insect population. In SSST-based screening, a reliable phenotyping technique, phenotyping was done when all the TN1 (susceptible check) seedlings were dead. The infestation of the RiMI lines was scored on a 0 to 9 scale [26]. The scale of 0 to 3, 3 to 5, 5 to 7 and 7 to 9 were scored as resistant, moderately resistant, moderately susceptible, and susceptible, respectively. The results of the analysis of variance are shown in Table 1. The RiMi panel exhibited a broader variation for a phenotypic score in both the years. The ANOVA revealed a significant mean sum of squares for different sources of variation except for block (eliminating treatments). Frequency distribution-based adjusted mean values are represented in Figure 2. There was a significant skewness towards susceptibility, since the majority of the lines in the panel recorded a 7 to 9 score. This could be attributed to the fact that most of the RiMi lines were segregated for susceptibility. Twenty-three lines recorded a <5.0 score consistently in both of the seasons. TN1 showed complete susceptibility (9 score) whereas resistant checks PTB33 and RP2068 showed scores of less than two and three, respectively.
BPH resistance scores in both the years revealed a significant variation available for insect response in the RiMi panel (Table S2). Lines MIB_4393, MIB_3693, MIB_4077 and MIB_4570 were identified as the best lines for BPH resistance, since the score was less than two, which was equivalent to best check PTB33. Eight RiMi lines, MIB_4370, MIB_3850, MIB_4208, MIB_4241, MIB_4619, MIB_3409, MIB_4134 and MIB_4286, showed a similar response as that of the resistant check RP 2068. Eleven lines were found to be moderately resistant to BPH with the score of <5.0. Moderate susceptibility was recorded by 84 lines. A total of 191 and 93 RiMi lines showed susceptible (score 7) and highly susceptible (score 9) reaction towards BPH incidence.  BPH resistance scores in both the years revealed a significant variation available for insect response in the RiMi panel (Table S2). Lines MIB_4393, MIB_3693, MIB_4077 and MIB_4570 were identified as the best lines for BPH resistance, since the score was less than two, which was equivalent to best check PTB33. Eight RiMi lines, MIB_4370, MIB_3850, MIB_4208, MIB_4241, MIB_4619, MIB_3409, MIB_4134 and MIB_4286, showed a similar response as that of the resistant check RP 2068. Eleven lines were found to be moderately resistant to BPH with the score of <5.0. Moderate susceptibility was recorded by 84 lines. A total of 191 and 93 RiMi lines showed susceptible (score 7) and highly susceptible (score 9) reaction towards BPH incidence.
Though 38 genes were reported for Biotype 4 from landraces and wild species [44,49,51], resistance was very low and donors were inferior in agronomic and quality traits. The popular BPH resistance sources, PTB33 (Bph2 and Bph3 donor) and Rathu Heenati (Bph3 and Bph17 donor), are agronomically inferior, hence identification of superior segregants with farmers' preferred slender grain in the recombinant programme is always a challenge.
The 391 RiMi lines used in the study were selected for superior agronomic and grain quality traits with BPH resistance. Better combinations of segregants for these traits were achieved in the resultant population since MAGIC provides more rounds of recombination by its design and makes it possible to break genetic drag and leads to novel genetic rearrangements with greater genetic diversity [32,52,53].
Among the RiMi lines, the identified MIB_4393, MIB_3693, MIB_4077, and MIB_4570, which are similar to PTB33 in resistance, could be used as new sources of resistance for developing varieties without genetic drag in the breeding programmes. Though 38 genes were reported for Biotype 4 from landraces and wild species [44,49,51], resistance was very low and donors were inferior in agronomic and quality traits. The popular BPH resistance sources, PTB33 (Bph2 and Bph3 donor) and Rathu Heenati (Bph3 and Bph17 donor), are agronomically inferior, hence identification of superior segregants with farmers' preferred slender grain in the recombinant programme is always a challenge.

Marker and Covariates
The 391 RiMi lines used in the study were selected for superior agronomic and grain quality traits with BPH resistance. Better combinations of segregants for these traits were achieved in the resultant population since MAGIC provides more rounds of recombination by its design and makes it possible to break genetic drag and leads to novel genetic rearrangements with greater genetic diversity [32,52,53].
Among the RiMi lines, the identified MIB_4393, MIB_3693, MIB_4077, and MIB_4570, which are similar to PTB33 in resistance, could be used as new sources of resistance for developing varieties without genetic drag in the breeding programmes. RiMi panel was inferred using principal component analysis (PCA). The RiMi panel explained the uniform distribution of alleles without any population structure (Figure 3). Principal components that explained up to 50% of the total variation were included in the GWAS models to correct any uneven allele frequencies. Kinship matrices built in the respective association mapping models were used for the additional correction of false positives.

Identification of SNPs
MAGIC populations developed in several crops including rice serve as potential genetic resources for the identification of QTLs. In the global MAGIC population of rice, 38 and 34 QTLs for yield and grain quality traits were identified through GWAS and interval mapping, respectively [20]. GWAS analysis was conducted using mixed linear models in the MAGIC indica plus population of rice, and 57 significant genomic regions for agronomic and bio-fortification traits were detected [54].
Here, we conducted the GWAS analysis in the MAGIC population using four single-locus models (GLM, MLM, CMLM and SUPER) and two multi-locus models (MLMM and Farm CPU) to identify the common SNPs associated with BPH resistance ( Figure S1). Each model has its own characteristics in terms of statistical power, selection of covariates, grouping of markers, computational power, ability to control false positives and the ability to include true associations, and so on [55]. Considering these factors and the trait complexity, we wanted to engage all these models to find out the marker-trait association, and the results from the different models are compared.
We considered the p value <10 −3 to identify significant SNPs across models and, using this threshold, a total of 248 associated SNPs that were significant for BPH resistance were detected for the mean BPH data. These SNPs were distributed on chromosomes 1, 2, 4, 5, 6, 7, 10, 11, and 12. The  (Table S3).
We have identified a maximum of 213 SNPs by employing the GLM model mapped on chromosomes 1, 2, 4, 5, 6, 7, 10 and 12. GLM detected the highest number of associations since it is the less stringent model among all, and hence, it is prone to detect more false positives. Both the MLM and CMLM methods identified the lowest number of SNPs (91 and 80, respectively) compared to other models. CMLM is an improvised MLM procedure by way of grouping the markers while a

Identification of SNPs
MAGIC populations developed in several crops including rice serve as potential genetic resources for the identification of QTLs. In the global MAGIC population of rice, 38 and 34 QTLs for yield and grain quality traits were identified through GWAS and interval mapping, respectively [20]. GWAS analysis was conducted using mixed linear models in the MAGIC indica plus population of rice, and 57 significant genomic regions for agronomic and bio-fortification traits were detected [54].
Here, we conducted the GWAS analysis in the MAGIC population using four single-locus models (GLM, MLM, CMLM and SUPER) and two multi-locus models (MLMM and Farm CPU) to identify the common SNPs associated with BPH resistance ( Figure S1). Each model has its own characteristics in terms of statistical power, selection of covariates, grouping of markers, computational power, ability to control false positives and the ability to include true associations, and so on [55]. Considering these factors and the trait complexity, we wanted to engage all these models to find out the marker-trait association, and the results from the different models are compared.
We have identified a maximum of 213 SNPs by employing the GLM model mapped on chromosomes 1, 2, 4, 5, 6, 7, 10 and 12. GLM detected the highest number of associations since it is the less stringent model among all, and hence, it is prone to detect more false positives. Both the MLM and CMLM methods identified the lowest number of SNPs (91 and 80, respectively) compared to other models. CMLM is an improvised MLM procedure by way of grouping the markers while a generating kinship relationship that could have reduced SNP associations over MLM. Among the single locus models used in this study, SUPER was the only method that could identify the marker-trait associations on chromosome 11 in addition to chromosomes 1, 2, 4, 5, 6, 7, 10, and 12. SUPER uses a subset of markers to compute the kinship, hence the statistical power is improved over MLM [37].
Since FarmCPU is the comprehensive model among all, we considered the results obtained by FarmCPU for identification of resistance genes associated with the SNPs. The identified 190 significant SNP loci located on chromosomes 1 (20 SNPs), 2 (20 SNPs), 4 (11 SNPs), 5 (39 SNPs), 6 (88 SNPs), 7 (5 SNPs) and 10 (7 SNPs) were used for searching genes in their vicinity ( Figure 4). The Manhattan plots and QQ plots for BPH resistance during 2016, 2017 and for the mean over years from FarmCPU are presented in Figure 5. Manhattan plots and QQ plots of GWAS for the other five models are submitted as Supplementary Figure S2.
Four important SNP clusters were identified on chromosome 1 and 2, where 20 SNPs were distributed in these regions (Table S3). One cluster was located in the 12 to 14.5 Mb region on the short arm of chromosome 1 and three clusters were located in the 1 to 2 Mb, 7.5 to 8.5 Mb and 34 to 35.5 Mb regions on both the short and long arms of chromosome 2. Significant SNPs identified on a long arm of chromosome 2 were mapped near (approximately 2 Mb regions away from) the Bph13 (t) gene reported in O. eichingeri [56] (Figure 4). Eleven significant SNPs that were identified at the 3 to 5.5 Mb region on the short arm of the chromosome 4 were also co-localized with Qbph4, Bph17 [57] and Bph12 [58]. Co-localization of the significant SNPs in the vicinity of the identified resistance gene locations such as Bph13 explained the fact that the founder lines are directly or indirectly descended from any one of those wild species [59].
Two regions with 39 SNPs spanning a region 0.7 to 1.1 Mb (short arm) and 23.0 to 24.0 Mb (long arm) were identified on chromosome 5. It is to be noted that the long arm of chromosome 5 was earlier mapped for Qbph5 [60], indicating the importance of this region for searching resistance gene(s). S1_12742211, S1_13273091, S1_13274423, S1_13365703, S1_13374131, S1_13374151, S1_13424589   Four important SNP clusters were identified on chromosome 1 and 2, where 20 SNPs were distributed in these regions (Table S3). One cluster was located in the 12 to 14.5 Mb region on the short arm of chromosome 1 and three clusters were located in the 1 to 2 Mb, 7.5 to 8.5 Mb and 34 to 35.5 Mb regions on both the short and long arms of chromosome 2. Significant SNPs identified on a long arm of chromosome 2 were mapped near (approximately 2 Mb regions away from) the Bph13 (t) gene reported in O. eichingeri [56] (Figure 4). Eleven significant SNPs that were identified at the 3 to 5.5 Mb region on the short arm of the chromosome 4 were also co-localized with Qbph4, Bph17 [57] and Bph12 [58]. Co-localization of the  Four important SNP clusters were identified on chromosome 1 and 2, where 20 SNPs were distributed in these regions (Table S3). One cluster was located in the 12 to 14.5 Mb region on the short arm of chromosome 1 and three clusters were located in the 1 to 2 Mb, 7.5 to 8.5 Mb and 34 to 35.5 Mb regions on both the short and long arms of chromosome 2. Significant SNPs identified on a long arm of chromosome 2 were mapped near (approximately 2 Mb regions away from) the Bph13 (t) gene reported in O. eichingeri [56] (Figure 4). Eleven significant SNPs that were identified at the 3 to 5.5 Mb region on the short arm of the chromosome 4 were also co-localized with Qbph4, Bph17 [57] and Bph12 [58]. Co-localization of the Chromosome 6 showed one major cluster of SNPs on short arm from 6.0 to 9.0 Mb with 86 SNPs. Bph32 was identified and located 4 Mb away from this region [61]. Five SNPs were clustered on chromosome 7 at a physical distance from 10 to 19.5 Mb and seven closely located SNPs mapped on the short arm of chromosome 10 from the 18 to 19 Mb region. Qbph7 [62] mapped on the long arm of chromosome 7 was found within the identified SNP cluster.
On chromosome 5, eight significant QTNs, S5_23249078, S5_23249119, S5_23249125, S5_23249237, S5_23249605, S5_23310970, S5_23312204 and S5_23314218 are mapped in the genic locations of LOC_Os05g39590 and LOC_Os05g39720 encoding AP2 domain containing protein and WRKY70. These two transcription factors are involved in a stress tolerant mechanism that could contribute to BPH resistance. The AP2 (APETALA2)/EREBP (Ethylene-Responsive Element-Binding Protein) family has an important regulatory function in environmental stress tolerance, that regulates diverse processes of plant development and metabolism. It mainly regulates ABA dependent or ABA independent stress responsive pathways and is also known to be involved in biotic stress response through the Jasmonic acid pathway and in abiotic stress response through the ABA biosynthetic pathway [66]. WRKY, a class of TF, is involved in plant stress responses including biotic stresses [67]. A class of WRKY, known as OsWRKY13, mediated the gene regulation through Salicylate and Jasmonate-dependent signaling pathways in rice disease resistance [68]. The Bph14 resistance gene is associated with WRKY46 and WRKY72 by protecting them from degradation, thus providing resistance [69]. The above-mentioned WRKY TFs provide BPF resistance by over-expressing the RLCK281 and callose synthase genes.
A gene LOC_Os01g24690 is associated with QTN S1_13898444 among 12 annotated genes identified on chromosome 1, encoding 60S ribosomal protein L23A. Ribosomal proteins have multiple functions and differentially regulate at the time of stress conditions [70]. The higher expression of ribosomal proteins is important for the proper functioning of several house-keeping genes during the stress conditions [71] and the QTN associated with such proteins could be useful for BPH resistance.
Allelic status of the significant QTNs that contribute to resistance in the agronomically superior RiMi lines (MIB_4393, MIB_3693, MIB_4077 and MIB_4570) are presented in Table 4. These lines could be used as new sources of resistance in the breeding programs as well as to explore downstream genes in genetic studies.

Conclusions
Over the past few years, substantial developments have been made in rice molecular breeding for high yield, biotic and abiotic stress tolerance, and grain quality, etc. Molecular breeding for BPH pest resistance in rice was, however, constrained because of the complexity of interrelatedness among BPH and rice. BPH is one of the most economically important insect pests of rice. Severe infestation in high-input rice ecologies results in major yield loss in susceptible varieties. Host-plant resistance is the best way to protect the environment and preserve the ecological balance of rice growing areas. The MAGIC panel identified 59 common SNPs on different rice chromosomes through six association mapping models. The gene search associated with QTNs revealed 13 stress tolerance genes that would play a significant role in BPH resistance. The 31 significant QTNs associated with the resistance genes could be used in accelerated marker-assisted breeding to develop BPH resistant varieties. Identified BPH resistant lines could be directly used in varietal release programs based on their agronomic performance and used as donors for BPH resistance to develop new resistant cultivars.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-393X/8/4/608/s1, Figure S1: Graphic representation of physical locations of the common SNPs for BPH resistance identified using single and multi-locus models on all chromosomes, Figure S2: Manhattan plots and Quantile-quantile (QQ) plots of different models for the two test years and across years, Table S1: Pedigree details of 391 R-MI population, Table S2: BPH resistance scores of 391 R-MI population across years, Table S3: SNP trait associations identified by different single and multi-locus GWAS models, Table S4: List of candidate genes (defense related genes) identified in the associated QTN regions.