Genome-Wide Association Study for Resistance to Rhynchosporium in a Diverse Collection of Spring Barley Germplasm

: Rhynchosporium is one of the main biotic stresses on barley production worldwide. A set of 312 spring barley accessions was tested in four different locations over 3 years, to identify novel genetic resistances to rhynchosporium and to explore the allelic diversity for resistance genes present in this global germplasm collection. High-density genotypes from exome capture and RNA-seq were used to conduct high-resolution association mapping. Seven quantitative trait loci (QTL) were detected, including one in the Rrs2 region, amongst ﬁve containing known resistances. Relatively short physical intervals harbouring these resistances were proposed, providing a platform for the identiﬁcation of underlying genes and tightly linked genetic markers for use in marker assisted selection. Genes encoding kinases were present in four of the QTL, in addition to Rrs1 and Rrs18 , two loci known to contribute to rhynchosporium resistance. The frequencies and distributions of these novel and known QTL were superimposed on the regional origin of the landrace genotypes comprising the genome-wide association studies (GWAS) panel, highlighting the value of genetic resources as a source of diverse genetically controlled resistance to rhynchosporium. The detected QTL along with their linked genetic markers, could be exploited either directly for breeding purposes or for candidate gene identiﬁcation in future studies.


Introduction
Barley is the fourth most important cereal crop worldwide, by cultivated area, with 52 million hectares grown in 2020 [1]. One of its major diseases is rhynchosporium or scald, caused by the hemibiotrophic fungal pathogen Rhynchosporium commune [2]. Rhynchosporium develops in cold and wet conditions, with primary inoculum originating from infected seeds or straw and further spore dispersal through water splashes [3]. The pathogen is highly genetically variable and 58% of the global genetic diversity has been reported to be represented within individual fields [4]. In the United Kingdom, rhynchosporium is the most damaging disease of barley and is mainly managed by fungicides, with two sprays recommended for spring barley [5]. The number of effective fungicides available is decreasing, due to the withdrawal of products or loss of efficacy, meaning that resistant barley cultivars will be key for sustainable barley production [6,7].
A better understanding of genetic resistance to rhynchosporium is needed to improve barley cultivars and the sustainability of the crop. Indeed, very little is known about the actual genes responsible for resistance to rhynchosporium. The major resistance loci are collectively known as Rrs (for Reaction to Rhynchosporium secalis, the former name of the species) [8], but so far candidate genes have been published for only three Rrs genes. Receptor-like kinases were suggested as causative genes for Rrs1 Rh4 [9] and Rrs18 [10]. In contrast, Rrs2 has been fine mapped to a 0.08 cm interval containing a cluster of Pectin Esterase Inhibitor genes, but none of them individually trigger the Rrs2 resistance response [11,12]. A receptor kinase-like protein has also been suggested as a candidate gene for a quantitative trait locus (QTL) QSc.VR4 [13]. In total, 11 major resistance genes (Rrs genes) have been described, but numerous other QTL have been identified and a full review was conducted in 2020 [14]. The authors listed 148 QTL, many of which are mapped to genetically large or overlapping intervals. So far, truly diagnostic markers based on single nucleotide polymorphisms (SNPs) have only been published for Rrs1 Rh4 [9].
European spring barley germplasm is relatively susceptible to rhynchosporium, with a prior association mapping study identifying only one effective major resistance gene (Rrs1 Rh4 ) segregating in elite spring barley cultivars [15]. As the pathogen is highly diverse and can adapt very quickly to defeat a single resistance gene [16], the identification and characterization of novel sources of resistance is necessary to provide long-term sustainable protection. Therefore, the use of diverse germplasm, including landraces, will be a valuable source of untapped genetic variability for rhynchosporium resistance for use in barley breeding [17].
Genome-wide association studies (GWAS) are a powerful tool to map regions of the genome associated with traits in germplasm collections. Indeed, GWAS can access considerable diversity and numerous historic recombination events, which results in reduced linkage disequilibrium (LD), providing greater mapping resolution [18]. Recently this approach has been used in a number of studies to identify resistance loci in barley [15,19,20]. In European spring barley germplasm, the Rrs1 Rh4 locus was shown to be the main contributor to resistance, with a further 15 resistance QTL showing minor effects [9]. In a collection of Ethiopian, North American, and the International Center for Agricultural Research in the Dry Areas (ICARDA) barley lines, multiple resistance QTL were found with 17 marker-trait associations (MTA) in 16 genomic regions [19]. The value of Scottish bere barley landraces to mine for rhynchosporium resistance genes was also pointed out and eight genetic regions associated with resistance were identified [20]. These studies used three, five, and two field trials, respectively, for their GWAS analysis. A further eight QTL originating from wild barley accessions were identified by GWAS in the Nested Association Mapping population HEB-25 [21]. GWAS was also used to map nine rhynchosporium resistance regions, including three new regions, in four multi-parent advanced generation inter-cross (MAGIC) populations tested in northern Europe [22].
The aim of this study was to determine whether using a highly diverse legacy collection of barley, from across the geographical range of the species, combined with over 100,000 physically located SNPs, variable field trials, and state of the art genome-wide analyses, could lead to detection of novel resistance QTL and narrow down the physical intervals for the previously published resistance QTL. A further important aim was to identify resistant accessions that could provide novel sources of pre-breeding material for the breeding community. SNP markers associated with resistance QTL detected here will help to improve resistance against this dangerous barley pathogen.

Field Trials
Four sites were used for the field trials between the 2018 and 2020 growing seasons. Table 1 provides details of the locations and the number of field trials carried out at each site. All the sites were autumn-sown (end of November), except for Maule in 2018, where a spring sown (beginning of March) field trial was also conducted. Each field trial had a single replicate block design with 1 m 2 mini-plots, except for Dundee where the field had a double replicate randomised block design. Rhynchosporium spread from the naturally present inoculum from residual barley crop debris built up over several years of field trials [9,25] and, therefore, the pathogen population was not controlled and could be different in each field [3,4]. In order to increase the quantity of primary inoculum, the previous crop was always barley, and the Dundee site had overhead irrigation and spreader rows consisting of a mixture of susceptible winter barley cultivars.

Field Phenotyping and Data Preparation
The rhynchosporium phenotyping was carried out toward the end of the growing season, in May and June (post anthesis). Rhynchosporium levels were scored on a 1 to 9 scale provided in Table 2, where 1 represents no visible symptoms in the entire plot and 9, where all the leaves were dead due to rhynchosporium [25,26]. Phenotypes provided in Supplementary Table S1 represent the disease level across the canopy. Most of the field trials were scored once during the season, although Dundee 2020, Maule 2018 and Pithiviers 2018 were scored on two separate occasions to take the progress of the infection into account. In these cases, the average of the two scores was used for further analysis. For the fields with two replicates of each accession, the average of the two scores was used for the analysis. Leaves appear 1/2 infected 1/2 green 7 Leaves appear more infected than green 8 Very little green leaf tissue left 9 Leaves dead no green leaf left In addition to rhynchosporium, heading date and plant height were scored in subsets of five and six field trials respectively. Indeed, plant height, and earliness are known to be involved in disease escape from rhynchosporium [3] and information on these traits can help the identification of rhynchosporium resistance genes per se. The heading date was scored as the day when half of the ears in a plot were becoming visible. Heading date is a relatively reliable and simple method to score earliness, it corresponds to Zadoks stage Z55 [27]. Plant height was measured in cm from the soil to the bottom of the spike at the beginning of ripening (Zadoks stage Z91). The best linear unbiased estimators (BLUE) for plant height and heading date were calculated for all the accessions, with the lmer function from the lme4 R package [28].
A multiple regression was then applied to explain the rhynchosporium phenotypes for the 8 field trials depending on the BLUE-height and BLUE-heading date, following the model below: where i = field trial, j = genotype, a and b are coefficients, and ε ij are the residuals for the field trial i and the genotype j.
The residuals of this model were then used as the rhynchosporium phenotypes to analyse in the association studies of the eight field trials. The use of a regression to prepare a dataset to analyse is a common practice when two quantitative variables are linked and is commonly used in the case of grain protein content in wheat, with the residuals of the regression protein content depending on yield called the grain protein deviation [29].

Genotyping
The accessions from the WHEALBI collection have legacy exome capture sequence data available. The exome capture data were generated using Illumina short read sequencing of predicted coding areas of the genome. A description of this and the population structure and LD pattern is available from a previous study [23]. The WHEALBI exome capture reads were mapped onto the Morex 2017 assembly [30] using BWA-MEM [31]. A SNP calling was performed with the GATK [32] following GATK's best practices pipeline [33]. The raw variant calls were filtered with custom Java version 6 USA. code using the following criteria: 1.
More than 8× coverage for at least 50% of the samples (to ensure robust SNP and genotype calls); 2.
More than 95% of samples represented at SNP locus (for maximum sample representation); 3.
Minor allele frequency of 1% (to exclude SNPs based on very rare alleles); 4.
Quality score: less than 1/1000 chance of the SNP having been called by error (for SNP robustness).
All the other accessions were RNA-sequenced (RNA-seq) except for Atlas46, SLB-22-014 and SLB-34-076 accessions for which whole genome shotgun (WGS) sequencing was carried out (>30× coverage). The RNA and genomic DNA extraction was done using the second leaf from 3-4 weeks old seedling using the Qiagen RNeasy Plant mini kit and DNeasy kit, following the manufacturer's instructions. DNA and RNA integrity was tested by gel electrophoresis. DNA and RNA yield was measured using PicoGreen (Turner BioSystems Inc., Sunnyvale, CA, USA). RNA samples were DNase I treated using the Ambion Turbo DNA-free kit (Fisher Scientific UK Ltd., Loughborough, UK), following the manufacturer's protocol. All the sequencing was done using Illumina paired-end sequencing. The RNA-seq gave on average 63 million reads (from 1.7 to 319 million reads depending on accession). The reads were mapped onto the Morex 2017 genome assembly [30] using STAR [34]. Both the RNA-seq and WGS data were used together for the SNP calling with the GATK. SNPs with less than 50% missing data in this dataset were selected to be merged with the exome capture dataset.
Hereafter, the data analysis was carried out using R 3.6.1 'Action of the toes' [35].
The diagnostic SNP chr3H_490244130 identified by Looseley et al. [9] was used to determine the list of accessions containing Rrs1 Rh4 .
The merging of the different SNP datasets resulted in 115,172 SNPs (Supplementary  Table S2). An imputation of the missing data was necessary to avoid the clustering of the accessions depending on the genotyping method used. The imputation was done with the knnimpute function from the R package scrime [36].

GWAS Using Single-Locus and Multi-Locus Models
The association mapping was carried out for each phenotyping dataset using three different methods: one single-locus GWAS method (EMMAX) and two multi-locus methods, where the SNPs were not tested alone, but in combinations to take into account epistasis. First, the GWAS was performed using an efficient mixed model association expedited, EMMAX, implemented by the rrBLUP R package [37]. In this model, the population structure is estimated in a first step and used as fixed effect in the mixed linear model [38]. The two multi-locus GWAS methods used were FASTmrMLM and BLINK. The multi-locus GWAS methods were designed to have higher statistical power and lower false positive results for multigenic traits [39]. FASTmrMLM was implemented with the mrMLM R package [40]. This method works in two stages, first the whole genome is scanned to detect the most significant markers, and in a second step, a few SNPs are selected and tested together in the multi locus model. The second multi-locus model used in this study was BLINK: Bayesian-information and Linkage Disequilibrium Iteratively Nested Keyway [41]. BLINK was implemented using the BLINK-R package [42]. Single-locus and multi-locus GWAS models are considered to be complementary depending on genetic architecture and population structure within the germplasm collection [43,44].
For all the GWAS methods, the SNPs detected were considered as having a significant impact on rhynchosporium scores when −log 10 P > 5. This significance threshold is relatively high and was set to avoid false positives in a context where GWAS for the same trait was run with three different methods potentially generating a high number of MTAs. The population structure was assessed with a cluster analysis based on Principal Component Analysis (PCA), and sub-populations were determined using the R 'kmeans' function. The LD decay was then calculated around each SNP detected in a GWAS, taking into account the population structure, using the LDcorSV package [45], as done by Bustos-Korts et al. [23]. The LD was averaged using a sliding window of 10 markers to estimate the local LD decay. All the SNPs detected with the different methods have been gathered in a single dataset. The SNPs with overlapping intervals for a LD decay of R 2 = 0.2 were considered as detecting the same QTL, as recommended by Alqudah et al. [18]. QTL were considered as consistent and kept in the final results if they were detected within at least two field trials.

Genotyping
The cluster analysis of the genotyping datasets composed of 115 172 SNPs per accession identified six groups within the landrace collection (Figure 1). Cluster 1 corresponds mainly to six rowed landraces of European origin. Cluster 2 is composed of Ethiopian landraces. Cluster 3 is mainly Asian landraces, from the Middle East to India and Korea. Cluster 4 is mainly European two-rowed landraces. Cluster 5 corresponds to six-rowed landraces from the Mediterranean region, and finally cluster 6 is mainly two-rowed landraces from the Middle East. The fact that barley accessions were clustered by their region of origin independent of the genotyping method used confirms the successful merger of the different SNP calling datasets. The diagnostic SNP for Rrs1 Rh4 [9] was used to identify a list of 32 accessions containing this major resistance gene to rhynchosporium (Supplementary Table S1). This list includes Belgravia, Retriever, and 13 Syrian/Jordanian landraces previously shown to contain Rrs1 Rh4 [9] as well as 1 more Syrian landrace, 3 landraces from Vavilov Institute collection, and 13 landraces from the WHEALBI collection. The majority of barley landraces containing Rrs1 Rh4 originated in the Mediterranean region (Supplementary Table S1).

Phenotyping
The rhynchosporium scores showed a high level of variability between genotypes. To ensure rhynchosporium infection most field trials were autumn sown as barley sown in autumn in the locations used for field trials was consistently infected with R. commune. This is mainly due to relatively mild and wet conditions in winter and early spring in the UK and France which favour R. commune development. Rhynchosporium infection occurred in each field trial, although the levels of disease severity and distribution patterns were different (Figure 2) Table S1). The shape of the distribution of scores varied for each field trial, reflecting different weather patterns, inoculum or other environmental conditions. The collection used in this study was also highly diverse in relation to plant height, as it contains both tall and dwarf accessions, and earliness, reflected by the heading date scores. The BLUE heading date and plant height showed Gaussian distribution of the landraces, with an average heading date on 20 May within an 18-day range, and an average plant height of 64 cm within a range of 31 to 94 cm ( Figure 3, Supplementary Table S1).  (Figure 4). By comparison, the correlation between replicates for the fields with two replicates, Dundee 2019 and Dundee 2020, was 0.79 and 0.62, respectively (this difference between years can be explained by a higher level of infection at the end of the season in Dundee in 2019 compared with 2020, resulting in a more reliable phenotyping). The correlation coefficients between rhynchosporium scores for the remaining trials were moderate with the lowest correlation between Maule 2018 spring sowing and all the autumn sown trials. Heading date and plant height were relatively highly correlated (0.5) and significantly negatively correlated with the rhynchosporium scores for all the field trials, except Maule 2018 spring sowing (Figure 4). This is in agreement with plant height and earliness being linked to disease escape to rhynchosporium [3]. One of the resistance QTL identified by Looseley et al. [15] co-localised with the well-characterised semi-dwarfing gene sdw1 [46], which was previously shown to contribute to disease escape [25].

Genome-Wide Association Studies
The GWAS with the three methods EMMAX, BLINK, and FASTmrMLM resulted in seven QTL detected in at least two field trials. These QTL are located on chromosomes 3H, 5H, and 7H, with one, two, and four QTL detected on these chromosomes, respectively (Table 2 and Supplementary Figures S1-S16). All the QTL detected on chromosomes 3H and 7H correspond to genomic regions with published QTL for resistances to rhynchosporium. The QTL-7H-1 was detected in three field trials ( Table 2) and was located within the physical interval for Rrs2 [11]. The QTL-7H-1 allele associated with resistance was the minor allele in the collection, with a frequency of 0.22. The QTL-7H-2 is a QTL with the highest P-value and colocalises with QTL mapped in an Australian cultivar [47]. QTL-3H-1, QTL-7H-3, and QTL-7H-4 are also in areas where QTL were previously identified based on bi-parental population studies [10,19,22,[48][49][50][51]. However, on chromosome 5H, QTL-5H-1 and QTL-5H-2 were detected in regions with no known resistances to rhynchosporium. Unlike chromosomes 2H, 3H, 4H, 6H, and 7H, chromosome 5H contain only five other rhynchosporium resistance QTL identified so far: Qsc5H-Shyri (0.9 Mb) [52], qSUK7_5 (456-526 Mb) [10], QTLCW5H.1 (555.7 Mb) [26], Qsc5H.3 (587.4 Mb) [19], and Qsc_5H_1 (638.2-659.5 Mb) [22]. The first two QTL on chromosome 7H are the only detected loci where the allele associated with resistance is the minor allele, the other putative resistances appear more widespread with frequencies of 0.64 to 0.91 in the landrace collection (Table 3).  This study used the highest number of markers for GWAS for resistance to rhynchosporium so far, 115.172 SNPs compared to under 9000 markers used by Looseley et al. [15], 23.549 SNPs used by Daba et al. [19], and 37.242 makes used by Cope et al. [20]. This resulted in identification of relatively narrow QTL intervals with the majority of them being under 1 Mb ( Table 2). The smallest QTL interval is less than 0.01 Mb for QTL-7H-1 within the Rrs2 interval. Only QTL-7H-3 interval stretches to 80 Mb, while QTL-5H-2 is 29 Mb. The intervals contain on average 102 genes, ranging from 1 gene for QTL-5H-1 and QTL-7H-4 to 512 genes for QTL-7H-3 ( Table 2). The list of the genes present in each interval in the Morex 2017 genome annotation [30] is provided in Supplementary Table S3.
Another putative candidate gene for rhynchosporium resistance within QTL-3H-1 is a germin-like protein (GLP). GLPs were previously found within another rhynchosporium resistance QTL [13]. GLPs were shown to be involved in basal host resistance against several fungal pathogens of cereals in barley [59,60], wheat [60] and rice [61].
It is not clear which genes could be putative candidate genes for QTL-7H-1 overlapping with Rrs2 and QTL-7H-4 overlapping with Rrs15 as the genome annotation for these regions does not contain any kinases. QTL-7H-2 overlapping with QTLSR-7H-2017 [47] is likely to contain a quantitative resistance gene that does not have to be a RLK or RLP. While there are only three annotated genes within this interval, none of them represent an obvious candidate for a quantitative resistance gene. Considering that the genome annotation is based on the North-American cultivar Morex genome assembly, which is not expected to contain many of the quantitative resistance genes or any R genes to rhynchosporium, and might not even have alleles of these genes, the gene content of these QTL would have to be examined in the genotypes containing these resistances. GWAS ability to detect MTAs relies on the allele frequency at QTL and, therefore, it is likely that rare resistance genes were missed during this analysis. Despite the presence of diagnostic SNP for Rrs1 Rh4 in 10% of the accessions used in this study, only a weak effect of this resistance, with −log 10 P reaching a value above the significance threshold in only one of the field trials (p-value = 3.15 × 10 −10 for Pithiviers 2018 with BLINK), was observed (Supplementary Figure S5). Previously Rrs1 Rh4 locus was shown to be the main contributor to resistance in European spring barley germplasm [15]. Interestingly, the disease nursery trials analysed in that study took place at the same field in Dundee, UK between 2013 and 2015 but using a different germplasm collection. This may indicate a loss of Rrs1 Rh4 efficiency in this field following the growth of barley accessions containing this resistance gene over several years. The relatively low effect of the Rrs1 Rh4 in field trials might also be a consequence of the wide use of this resistance gene in commercially grown cultivars [15].

Geographical Distribution of Resistance Alleles
The countries of origin of the landraces were combined into wider geographical areas to study the regional frequency of the resistance QTL. The regions attributed to each accession are presented in the Supplementary Table S1. The Middle East region provided the highest number of genotypes, with 108 accessions, followed by Europe (87 accessions), Asia (52 accessions), North Africa (29 accessions), Ethiopia (18 accessions), North America (10 accessions), and South America (8 accessions). The frequencies of the alleles associated with resistance for each region are presented in Table 4.
The Ethiopian landraces carried the highest number of resistance QTL (the allele associated with resistance is the major allele for all QTL except in the case of QTL-7H-1, for which all the Ethiopian landraces possess an allele associated with susceptibility) while the Asian landraces showed the lowest number. The landraces from Asia and Europe had frequencies for alleles associated with resistance below 50% for 4 and 3 QTL located on chromosome 7H respectively. The landraces from the Middle East carried all the alleles associated with resistance at relatively high proportions (39% minimum). The allele frequencies for the landraces from North Africa, North America, and South America are based on very small numbers of accessions but still reflect the global trend of QTL-7H-1 and QTL-7H-2 having the lowest frequencies for alleles associated with resistance. Geographical distribution of alleles associated with resistance in the global barley landrace collection used in this study clearly demonstrated the presence of resistance to rhynchosporium in barley accessions from the primary (Fertile Crescent) and secondary (Ethiopia) centres of barley diversity. This further supports the suggestion that at least some resistance to rhynchosporium has evolved in the Fertile Crescent [9] despite the theory of R. commune originating in Northern Europe [4]. Therefore, global barley accessions, including those from the Fertile Crescent and Ethiopia, represent a valuable source of resistance to rhynchosporium yet to be fully utilised by European breeders.

Conclusions
Association mapping is a powerful tool for detection of genomic regions associated with a trait of interest. Seven QTL were detected in at least two field trials in this study. The use of historical recombination events in germplasm collection allows mapping of traits to relatively small intervals and exploring the diversity of QTL available in a wide germplasm collection. However, the QTL detected by GWAS need to be confirmed by other methods. In our case, five of the QTL detected were previously described in biparental populations, and the importance of our study is in identification of relatively narrow intervals for these QTL. Our results provide insight into potential candidate genes annotated in these intervals, though the genetic content of these QTL should be investigated in cultivars where their presence have been confirmed. Further work is still needed to clone the resistance genes underpinning these QTL, but our study provides insight into the location of rhynchosporium resistance QTL in a spring barley collection representing global diversity and into the geographical distribution of the resistances to rhynchosporium. Achieving long-lasting protection against rhynchosporium would benefit from combining a wide range of complementary resistances. This requires further efforts in identification of diagnostic markers to make marker-assisted breeding for resistance to rhynchosporium a reality. The QTL identified in this study and the accessions containing alleles associated with resistance further expand the resources available for breeding varieties required for sustainable barley production.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/agronomy12040782/s1, Figures S1-S16: Manhattan plots of the GWAS with the BLINK and EMMAX methods (FASTmrMLM do not produce Manhattan plots for its final results); Table S1: Details of the barley accessions used in the study, and phenotypes scored in the field trials; Table S2: Genotyping matrix containing alleles at the 115,172 SNPs used in the GWAS; Table S3: Gene content of the QTL detected in the GWAS, based on the Morex V1 genome annotation.