Genome-Wide Association Study of QTLs Conferring Resistance to Bacterial Leaf Streak in Rice

Bacterial leaf streak (BLS) is a devastating rice disease caused by the bacterial pathogen, Xanthomonas oryzae pv. oryzicola (Xoc), which can result in severe damage to rice production worldwide. Based on a total of 510 rice accessions, trialed in two seasons and using six different multi-locus GWAS methods (mrMLM, ISIS EM-BLASSO, pLARmEB, FASTmrMLM, FASTmrEMMA and pKWmEB), 79 quantitative trait nucleotides (QTNs) reflecting 69 QTLs for BLS resistance were identified (LOD > 3). The QTNs were distributed on all chromosomes, with the most distributed on chromosome 11, followed by chromosomes 1 and 5. Each QTN had an additive effect of 0.20 (cm) and explained, on average, 2.44% of the phenotypic variance, varying from 0.00–0.92 (cm) and from 0.00–9.86%, respectively. Twenty-five QTNs were detected by at least two methods. Among them, qnBLS11.17 was detected by as many as five methods. Most of the QTNs showed a significant interaction with their environment, but no QTNs were detected in both seasons. By defining the QTL range for each QTN according to the LD half-decay distance, a total of 848 candidate genes were found for nine top QTNs. Among them, more than 10% were annotated to be related to biotic stress resistance, and five showed a significant response to Xoc infection. Our results could facilitate the in-depth study and marker-assisted improvement of rice resistance to BLS.


Introduction
Bacterial leaf streak (BLS) is a disease caused by the bacterial pathogen, Xanthomonas oryzae pv. oryzicola (Xoc), in rice. BLS is considered as one of the most destructive diseases in rice-producing regions and can cause severe rice yield loss [1]. It has been proved that the BLS resistance is quantitatively inherited in rice [2]. In general, quantitative disease resistance is controlled by multiple genes, non-race-specific and durable [3]. Therefore, the breeding of disease-resistant cultivars is a desirable approach to manage BLS in rice. To this end, it is necessary to map the quantitative trait loci (QTLs) which confer BLS resistance. Tang et al. (2000) [2] mapped eleven QTLs for BLS resistance on six chromosomes in rice. Zheng et al. (2005) [4] mapped a major QTL for BLS resistance, which could explain 13.7% of the phenotypic variance in the F2 population used. Chen et al. (2006) [5] mapped a major QTL for BLS resistance on chromosome 11. In addition, it was found that a recessive gene, bls1, exhibited a race-specific resistance to BLS [6], and a locus Xo1 conferred a complete Plants 2021, 10, 2039 2 of 12 resistance to the African clade of Xoc strains of BLS [7]. Further, a non-host resistance gene Rxo1 from maize displayed a qualitative resistance to BLS [8] by activating multiple defense pathways related to a hypersensitive response (HR) against Xoc in rice [9].
A genome-wide association study (GWAS) is a powerful alternative approach to mapping QTLs or, more precisely, quantitative trait nucleotides (QTNs). Many statistical methods are developed for GWAS, including CMLM [10], ECMLM [11], mrMLM [12], ISIS EM-BLASSO [13], pLARmEB [14], FASTmrEMMA [15], FASTmrMLM [16], and pKWmEB [17]. In these models, MLM is a mixed linear model (Q+K model); FAST is factored spectrally transformed function; ISIS is iterative-modified sure independence screening; EB is Expectation-Maximization; BLASSO is Bayesian least absolute shrinkage and selection operator; and EMMAX is an efficient mixed-model association expedited function that uses pKWmEB, pLARmEB, the Kruskal-Wallis test and the LARS algorithm, respectively. An R package called mrMLM was developed [18], in which these six multilocus GWAS methods were integrated in the R-based software [19]. Because multi-locus GWAS models are relatively closer to the true genetic models of plants and animals than existing single-locus GWAS methods, these methods appear to display a more robust identification of QTNs with a lower false positive rate (FPR) in the analysis, especially with the small-effect quantitative trait [12,19]. Recently, GWAS is widely used to investigate the diverse and complex traits of plants, such as maize [12], barley [20], Arabidopsis thaliana [21] and rice [22][23][24][25].
Although several QTLs (or genes) for BLS resistance were identified in rice, they were most likely only a small part of a whole because those studies were all performed based on bi-parent crosses, from which only the QTLs (or genes) with an allelic difference between the two parents could be detected. To enhance BLS resistance breeding in rice, it is necessary to investigate a wide range of rice germplasms to map more QTLs related to BLS resistance and to identify the germplasms for better BLS resistance. A high-density single-nucleotide polymorphism (SNP) map and comprehensive HapMap were constructed in rice [26,27], and were valuable for a GWAS of complex traits. Lately, a total of seven genomic loci associated with BLS resistance were detected, based on the mixed linear model (MLM), in one season trial [28]. In our study, by phenotyping a panel of rice accessions in two seasons, we performed a multi-locus GWAS analysis of BLS resistance, aiming to identify a substantial number of loci and SNP markers related to BLS resistance, which would accelerate the genetic improvement of BLS resistance in rice.

Population Structure and LD Pattern in the Panel
Both PCA and fastSTRUCTURE analysis indicated that the panel of 510 rice accessions used in this study could be approximately divided into three major groups: indica group, japonica group, and circum-Aus group. In addition, there were some accessions which appeared to be genetic mixtures of the major groups. This multivariate-based partition was principally consistent with the genetic and geographic origins of the rice accessions. General speaking, this panel had a broad representation of rice (Oryza sativa L.). The two populations sampled from the panel for the experiments of different seasons both possessed the features of the panel, displaying very similar population structures ( Figure 1).
Usually, the potential of resolution in GWAS is determined by the degree of LD decay, which is influenced by several factors including population structure, allele frequency, recombination rate and selection. To evaluate the potential of resolution in this study, the relationship of LD with physical distance in the whole panel of rice accessions used was calculated. A diagram of LD (R 2 ) against physical distance clearly displayed the pattern of LD decay in this population ( Figure 2). The attenuation distance of LD reduction from the maximum to 0.1 and then to half of the LD values, which were~230 kb and 330 kb, respectively. Usually, the potential of resolution in GWAS is determined by the degree of LD decay, which is influenced by several factors including population structure, allele frequency, recombination rate and selection. To evaluate the potential of resolution in this study, the relationship of LD with physical distance in the whole panel of rice accessions used was calculated. A diagram of LD (R 2 ) against physical distance clearly displayed the pattern of LD decay in this population ( Figure 2). The attenuation distance of LD reduction from the maximum to 0.1 and then to half of the LD values, which were ~230 kb and 330 kb, respectively.

Variation of Lesion Length
The lesion length showed a wide unimodal continuous distribution skewing to the low-value direction in both populations ( Figure 3). Population 1 showed a more serious skewness and larger variation (variation range = 0.15-7.83 cm, standard deviation = 1.32) than Population 2 (variation range = 0.25-5.13 cm, standard deviation = 0.59). An analysis of variance (ANOVA) of the 385 common accessions of the two populations also showed a very significant variation among the accessions (genotypes) and a very significant interaction between the genotype and environment (Table 1). However, the average variation

Variation of Lesion Length
The lesion length showed a wide unimodal continuous distribution skewing to the low-value direction in both populations ( Figure 3). Population 1 showed a more serious skewness and larger variation (variation range = 0.15-7.83 cm, standard deviation = 1.32) than Population 2 (variation range = 0.25-5.13 cm, standard deviation = 0.59). An analysis of variance (ANOVA) of the 385 common accessions of the two populations also showed a very significant variation among the accessions (genotypes) and a very significant interaction between the genotype and environment (Table 1). However, the average variation caused by environment was not significant (Table 1). In fact, the two populations had very close mean values for BLS lesion (1.24 cm and 1.27 cm, respectively). In addition, there was a high correlation between the two populations (coefficient of correlation = 0.61, p-value < 0.001, calculated based on the 385 common accessions of the two populations). Moreover, the broad sense (mean-based) heritability based on the two-way ANOVA analysis for BLS lesion was analyzed using the data collected across two different environments over 2 years, and these data appeared to be quite encouraging (63.3%).

Variation of Lesion Length
The lesion length showed a wide unimodal continuous distribution skewing to the low-value direction in both populations ( Figure 3). Population 1 showed a more serious skewness and larger variation (variation range = 0.15-7.83 cm, standard deviation = 1.32) than Population 2 (variation range = 0.25-5.13 cm, standard deviation = 0.59). An analysis of variance (ANOVA) of the 385 common accessions of the two populations also showed a very significant variation among the accessions (genotypes) and a very significant interaction between the genotype and environment (Table 1). However, the average variation caused by environment was not significant (Table 1). In fact, the two populations had very close mean values for BLS lesion (1.24 cm and 1.27 cm, respectively). In addition, there was a high correlation between the two populations (coefficient of correlation = 0.61, pvalue < 0.001, calculated based on the 385 common accessions of the two populations). Moreover, the broad sense (mean-based) heritability based on the two-way ANOVA analysis for BLS lesion was analyzed using the data collected across two different environments over 2 years, and these data appeared to be quite encouraging (63.3%).

QTNs Associated with BLS Resistance
Using six different multi-locus GWAS methods, a total of 79 QTNs for BLS resistance were detected (LOD > 3) in the two seasons (populations), with 29 in the first season and 50 in the second season, respectively, and none were detected in both seasons (Supplementary Table S1). The QTNs were distributed on all chromosomes, with the most found on chromosome 11 (17 QTNs), followed by chromosomes 1 and 5 (9 QTNs each), and the fewest found on chromosomes 9 and 10 (3 QTNs only) ( Figure 4). Most of the QTNs were detected by one method only, but 25 were detected by two or more methods simultaneously. Among them, one QTN (qnBLS11.17) was detected by five methods and two QTN (qnBLS5.5 and qnBLS12.2) were detected by four methods (Supplementary Table S1). The additive effect of individual QTN varied from 0.00-0.92 (cm), with an average of 0.17 (cm), explaining 2.44% of the phenotypic variance, on average ranging 0.00-9.86% (Note: for a QTN detected by two or more methods, the additive effect and the percentage of the explained variance were both indicated by the mean of estimates obtained by different methods; Supplementary Table S1).   Among the 79 QTNs, 37 met at least one of the following conditions: (1) LOD > 5; (2) detected by at least two methods; and (3) an additive effect > 0.45 (Table 2). These QTNs were likely to be more reliable (based on the first and second conditions) and/or potentially more useful for breeding (based on the third condition). The statistically most significant QTN was qnBLS8.5, which was detected by three different methods in the first season. This QTN showed the highest values of LOD score (7.78) and the largest additive effect (0.92), explaining~3% of the phenotypic variance (Table 2; Supplementary Table S1). In addition, 1/3 of these 37 QTNs (including qnBLS8.5) displayed a significant (P < 0.05) interaction with the environment (Table 2). To evaluate the reliability of the QTNs detected, 9 top QTNs, including five which had the top LOD scores (qnBLS8.5, qnBLS11.14, qnBLS11.17, qnBLS5.1 and qnBLS3.3) and five which had the top additive effects (qnBLS9.3, qnBLS9.1, qnBLS4.1, qnBLS8.4 and qnBLS8.5), were individually reexamined with a t-test. The results demonstrated a significant difference in lesion length between the two genotypes of each QTN, validating the association of these QTNs with BLS resistance ( Figure 5).

Corresponding QTLs and Candidate Genes
Based on the half attenuation distance of LD decay, the range from the upstream (330 kb) to the downstream (330 kb) of a QTN could be empirically defined as the range of QTL corresponding to the QTN. However, it is noteworthy, that there were six pairs of adjacen QTNs detected by different methods but located closely (in a distance of less than half the attenuation distance). They were: qnBLS1.4 vs. qnBLS1.5 (19. Table S1). In addition, although no QTN was de tected in both seasons simultaneously, there were four pairs of QTNs that were detected in different seasons but located close together. They were: qnBLS4.1 vs. qnBLS4.2 (94.5 kb apart), qnBLS8.2 vs. qnBLS8.3 (12.3 kb apart), qnBLS11.4 vs. qnBLS11.5 (24.5 kb apart), and qnBLS11.10 vs. qnBLS11.11 (31.8 kb apart; Figure 4, Supplementary Table S1). Considering the influence of the experimental error and the limitation of mapping resolution, two nearby QTNs might reflect the same QTL. Therefore, the 79 QTNs detected might reflect 69 QTLs in total.
Based on the Rice Genome Annotation Project [29] annotation, the potential candidate genes of each QTL could be analyzed. There were a total of 848 annotated genes in the QTL regions of the 9 top QTNs (Supplementary Table S2). Among them, 99 putative For each QTN, the population was divided into two groups according to allele types (N = Reference allele/SNP allele). The X-axis presented the two alleles for each QTN, while the Y-axis presented the lesion length. A t-test was used to measure the phenotypic difference of the haplotypes, and a p-value is shown at the top of the figure.

Corresponding QTLs and Candidate Genes
Based on the half attenuation distance of LD decay, the range from the upstream (330 kb) to the downstream (330 kb) of a QTN could be empirically defined as the range of QTL corresponding to the QTN. However, it is noteworthy, that there were six pairs of adjacent QTNs detected by different methods but located closely (in a distance of less than half the attenuation distance). They were: qnBLS1.  Table S1). In addition, although no QTN was detected in both seasons simultaneously, there were four pairs of QTNs that were detected in different seasons but located close together. They were: qnBLS4.1 vs. qnBLS4.2 (94.5 kb apart), qnBLS8.2 vs. qnBLS8.3 (12.3 kb apart), qnBLS11.4 vs. qnBLS11.5 (24.5 kb apart), and qnBLS11.10 vs. qnBLS11.11 (31.8 kb apart; Figure 4, Supplementary Table S1). Considering the influence of the experimental error and the limitation of mapping resolution, two nearby QTNs might reflect the same QTL. Therefore, the 79 QTNs detected might reflect 69 QTLs in total.
Based on the Rice Genome Annotation Project [29] annotation, the potential candidate genes of each QTL could be analyzed. There were a total of 848 annotated genes in the QTL regions of the 9 top QTNs (Supplementary Table S2). Among them, 99 putative genes were predicted to be possibly related to biotic stress resistance in function, including LRR (Leucine-rich repeat), protein kinase, zinc finger protein, PR protein, glycosyl hydrolase, MYB protein, and so on. Moreover, two predicted genes (LOC_Os09g19330 and LOC_Os11g47447) were annotated as having the bacterial resistance function of the stripe rust resistance protein Yr10.

Discussion
BLS resistance is a quantitative trait controlled by many genes in rice. So far, however, only 13 QTLs for BLS resistance are mapped in rice using the conventional QTL mapping method based on structural mapping populations derived from bi-parent cross [2,4,5]. In this study, based on a set of 510 rice varieties trialed in two seasons and using six different statistical methods, a total of 79 QTNs corresponding to 69 QTLs for BLS resistance were detected, indicating that a GWAS was much more efficient for QTL mapping than the conventional method. The main reason for this was that the conventional method only detected the genetic variation between two parents, while a GWAS detected the genetic variation among hundreds or thousands of varieties.
According to the positions of linked molecular markers, we found that the QTNs, qnBLS2.6, qnBLS3.2, qnBLS5.1 and qnBLS11.1, detected in this study were very close to the reported QTLs, qBblsr2, qBblsr3c, qBblsr5a and qBlsr11 [2], respectively (Figure 4; Supplementary Table S1). This suggested that the four QTNs might have reflected the four known QTLs, validating the reliability of these four QTNs. In particular, the reported QTL qBblsr5a was fine mapped and proved that its effect on BLS resistance was mainly, if not entirely, attributed to the known recessive gene, xa5 [30,31]. Jiang et al. (2021) [28] presented seven BLS resistant QTNs, identified by a GWAS approach, and four of the QTNs (qnBLS6.5, qnBLS6.6, qnBLS8.5 and qnBLS11.7) detected in our study appeared to be collinear with these QTNs. Our QTNs, qnBLS6.5 and qnBLS6.6, were 926 kb and 387 kb, respectively, from the QTN detected in their study challenged by the pathogen strain, HNBI-19, while the QTN qnBLS11.7 identified by two models was 146 kb from the QTN in chromosome 11. Moreover, the QTN, qnBLS8.5, which had the highest LOD score and was detected by three different models in our study, was only 21 kb away from the QTN obtained by the GWAS study using the same strain [28].
Most of the rice accessions trialed in the two seasons were the same in this study. However, there were many more QTNs detected in the second season than in the first season, and no common QTNs were detected in both seasons. According to the QTN positions, only four possible QTLs were commonly detected in the two seasons. These results suggested that the QTNs (QTLs) detected by GWAS were very unstable across the environments. In contrast, the conventional method usually had a much higher repeatability in QTL detection. In the study of Tang et al. (2000) [2], 6 out of 11 QTLs for BLS resistance were detected across two years. Compared with the structural populations for conventional QTL mapping, the natural populations of a GWAS usually contain many more QTLs, and the allelic frequency of a QTL often significantly deviated from 0.5. Therefore, each QTL usually had a very small heritability and was easily influenced by environmental variation in the natural populations. This implied that the QTLs detected by GWAS might have had a higher rate of false positives. This was probably a cost of the high efficiency of GWAS. Therefore, both the conventional method and GWAS had advantages and shortcomings; one could not replace the other.
The QTNs for BLS resistance were detected on all chromosomes in this study, but chromosome 11 had the largest number, many more than on any other chromosomes ( Figure 4). This result appeared to be consistent with the findings that disease resistance (R) or R-like genes were enriched on chromosome 11 in rice [32][33][34][35]. It was proposed that the quantitative resistance loci (QRLs) were probably "defeated" versions of R genes with weaker phenotypes [36]. Indeed, for example, many QRLs for rice blast resistance were mapped in the same or similar locations as known R genes [24,[37][38][39]. In Brassica napus, a whole genome study showed that 204 nucleotide binding site (NBS)-encoding genes were located in the QRLs of three diseases (blackleg, clubroot and Sclerotinia stem rot), with most in the QRLs resistant against a single disease, 20% in the QRLs resistant against two diseases, and three genes in the QRLs resistant against all the three diseases [40]. Apart from the possibility of "defeated" versions of R genes, there could be another scenario where an R gene functions as a major gene for one disease, but as a minor gene (QRL) for another disease. For instance, xa5, is a recessive major resistance gene against bacterial blight in rice [41,42], but it is also the most possible candidate gene of the QTL, qBlsr5a, conferring BLS resistance [30].
In a previous study, we identified 157 differentially expressed genes (DEGs) responding to the infection of the BLS pathogen (Xoc) in the rice cultivar, Nipponbare, using RNA-seq [31]. Among these DEGs, two (LOC_Os05g02060 and LOC_Os05g02070), one (LOC_Os09g12660) and two (LOC_Os11g47600 and LOC_Os11g47680) were found to be included in the lists of candidate genes for the top QTNs (QTLs); from this study, they were qnBLS5.1, qnBLS9.1 and qnBLS11.17, respectively (Supplementary Table S2). According to the annotation, LOC_Os09g12660 encoded the glucose-1-phosphate adenylyl transferase large subunit and functioned as the chloroplast precursor, catalyzing the synthesis of the activated glycosyl donor, ADP-glucose, from Glc-1-P and ATP, which is essential for the starch synthesis of leaf chloroplasts [43]. The response of LOC_Os09g12660 to Xoc infection was also observed in an earlier study [9]. LOC_Os11g47600 belongs to the glycosyl transferase multigene family, which is found to be widely involved in stress responses [44,45] and related to the hypersensitive response (HR) to the pathogen, Pseudomonas syringae pv. tomato (Pst-AvrRpm1), in Arabidopsis [44,46]. LOC_Os11g47680 encodes a thaumatin family domain containing the protein (TLP), which shows elite properties in biotic and abiotic resistances [47][48][49]. The rice TLP gene exhibits an antifungal effect after it is overexpressed in banana [47]. The expression of TLP in rice also responds to a Rhizoctonia solani attack and environmental signals [48]. Taken together, these candidate genes potentially cause BLS resistance effects in the corresponding QTLs.

Mapping Panel and Field Experiments
A panel of 510 diverse rice accessions with known SNP genotypes from the public data of 3K Rice Genomes Project (2014) were used in this study (Supplementary Table S3), including 177 japonica, 251 indica, 65 circum-Aus group (cA), 3 circum-Basmati group (cB), and 14 admixed (between major groups), according to Wang et al. (2018) [27]. Two field experiments were performed in the Experimental Farm of Fujian Agriculture and Forestry, University at Yangzhong, Fujian Province (E118.485841, N26.287161), in 2019. In the first growing season (from April to August) and the second growing season (from July to October), 448 accessions (Population 1) and 447 accessions (Population 2) from the panel were planted, respectively, with 385 accessions common in the two seasons (Supplementary  Table S3). In both seasons, rice seeds were sown after pre-germination and 10 seedlings per accession were transplanted 25 d after sowing in the paddy field, with a distance of 20 cm between rows and between seedlings. Field management followed the regular procedure.

BLS Resistance Assessment and Broad Sense Heritability
Pathogen inoculation was conducted at the active tillering stage of rice with a highly virulent Xoc strain kindly provided by Prof. Guoying Cheng of Huazhong Agricultural University, using the pricking inoculation approach [2,30]. Five leaves with similar age were inoculated in each plant and three of them were randomly selected for measuring the longitudinal length of lesion on the leaf with a ruler 18 d after inoculation. The average lesion length of three leaves was calculated for each plant, and the average lesion length of 5 plants was calculated to indicate the BLS resistance of each accession. The broad sense (mean-based) heritability analysis from two-way ANOVA was conducted using the following equation: h 2 = σG 2 /[σG 2 + (σGE 2 /e) + (σe 2 /re)] where σG 2 (genetic variance), σGE 2 (genotype-environment interaction variance) and σe 2 (variance of error) were measured and normalized using e (number of environment) and r (number of replicates) [50]. R [51] was used in the statistical analysis including two-way ANOVA and broad sense heritability with its native packages. The significant level of the assessed traits was displayed using R package car (type II Wald chi-square tests) [51].

Multi-Locus GWAS
The SNP data of 510 rice accessions were obtained from the 3K Rice Genomes Project database [52]. A subset of 140345 SNPs meeting the requirement of minor allele frequency (MAF) 5% and missing data ratio < 0.1 were selected from the core genome set of 404K SNPs [53] for population and association analyses. The population structure was investigated using the method of PCA (principal component analysis) plots and the program fastSTRUCTURE [54] (http://rajanil.github.io/fastStructure/). LD decay for the whole population was measured by correlation coefficients (R2) for all pairs of SNPs within 1000 Kb using the tool of plink [55].

Comparison of Phenotypic Differences Corresponding to QTNs and Identification of Candidate Genes
The phenotypic value for each accession was collected based on the SNP information as that of reference allele and target allele, respectively. A t-test was conducted to assess the phenotypic differences between the two groups. To obtain reliable candidate genes, the QTNs with the top five LOD scores and top five values of additive effect were annotated based on the Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/). The interval of half LD attenuation distance from the upstream to the downstream of a QTN was empirically defined as the region of the corresponding QTL.

Conclusions
With six different multi-locus GWAS methods (mrMLM, ISIS EM-BLASSO, pLARmEB, FASTmrMLM, FASTmrEMMA and pKWmEB), a total of 79 QTNs for BLS resistance were identified, these QTNs were not evenly distributed on rice genome. Each QTN had an additive effect of 0.20 (cm) on average and explained the cumulative degree of phenotypic variation. The novel QTNs identified in this study can be used to improve rice variety resistance to BLS through marker-assisted study.