Genome-Wide Association Study (GWAS) for resistance to Sclerotinia sclerotiorum in Common Bean

White mold (WM) is a devastating fungal disease affecting common bean (Phaseolus vulgaris L.). In this research, a genome-wide association study (GWAS) for WM resistance was conducted using 294 lines of the Spanish diversity panel. One single-locus method and six multi-locus methods were used in the GWAS. Response to this fungus showed a continuous distribution, and 28 lines were identified as potential resistance sources, including lines of Andean and Mesoamerican origin, as well as intermediate lines between the two gene pools. Twenty-two significant associations were identified, which were organized into 15 quantitative trait intervals (QTIs) located on chromosomes Pv01, Pv02, Pv03, Pv04, Pv08, and Pv09. Seven of these QTIs were identified for the first time, whereas eight corresponded to chromosome regions previously identified in the WM resistance. In all, 468 genes were annotated in these regions, 61 of which were proposed potential candidate genes for WM resistance, based on their function related to the three main defense stages on the host: recognition (22), signal transduction (8), and defense response (31). Results obtained from this work will contribute to a better understanding of the complex quantitative resistance to WM in common bean and reveal information of significance for future breeding programs.


Introduction
Common bean, Phaseolus vulgaris L., is the most important grain legume for human consumption worldwide [1] and can be harvested fresh as snap beans (pods harvested before the seed development phase) or as dry beans (seeds harvested at complete maturity). It is a diploid species (2n = 2x = 22) native to the Americas, in which two major eco-geographically and genetically distinct gene pools, the Andean (AN) and the Mesoamerican (MA) have been described [2][3][4][5][6]. Common bean was introduced into Europe from the Americas in the 16th century, and since then, a considerable diversity of this crop has been traditionally grown in many Spanish regions. Part of this traditional Spanish diversity has been gathered in the Spanish Diversity Panel SDP [7].
Diseases are one of the main constraint factors in common bean production. Among them, white mold is responsible for important losses of bean yield production on a global scale. The causal agent of white mold in the common bean is the fungus Sclerotinia sclerotiorum (Lib.) de Bary, which is capable of infecting over 400 plant species worldwide [8,9]. Symptoms of the disease are easily identified with a white cottony appearance caused by the fungal mycelium on stems, leaves, flowers, or pods. Resting structures called sclerotia are produced during the infection process as hardened, dense mycelial bodies, that can survive in the soil for several years, increasing the incidence of the According to Pascual et al. [36], lines Cornell49242 and AB136 were used as checks of susceptibility and high resistance responses, respectively.
Genotyping-by-sequencing GBS, [37] was carried out at BGI-Tech (Copenhagen, Denmark) using the ApeKI restriction enzyme. The sequencing reads obtained were aligned using the Phaseolus vulgaris L. genome v1 (www.ncbi.nlm.nih.gov; [24]) and a total of 9070 single-nucleotide polymorphism (SNPs) were mapped [7]. Data were filtered in Tassel v 5.2 software [38] for missing values (<10%) and minor allele frequencies (MAF ≥ 0.05), being considered for GWAS 5394 SNPs distributed across the eleven bean chromosomes ( Figure S1). Each SNP was named to indicate the chromosome onto which it was mapped, followed by its physical location in base pairs.

White Mold Response Evaluation
Resistance tests were conducted using the straw method under controlled greenhouse conditions [39]. All lines were evaluated in two separate experiments (E1: sowing day 17 April 2018; E2: sowing date 26 April 2018). Each experiment consisted of two replicates (pots) per line, with 4-5 plants per pot, arranged in a randomized complete block design. Seeds were sown in 15 cm diameter plastic pots of 1.5 L volume containing 80% peats: 20% perlite and were watered and fertilized for normal plant growth. Each plant was inoculated when the third internode of the main stem was fully developed (21-28 days after planting).
The local S. sclerotiorum isolate, WM2, derived from a single hyphal tip and conserved in the form of sclerotia at −20 • C, was used for the resistance test [36]. One sclerotium was germinated on potato dextrose agar medium (Difco, Becton, Dickinson and Company, Spark, MD, USA) in a standard Petri plate and then subcultured to prepare plates of fresh and actively growing mycelium for the inoculations. A single mycelial plug was placed, mycelium-side down, on the cut main stem. The inoculated plants were maintained at moderate temperature (16 • C to 24 • C) in the shade and at a high relative humidity (70%). Disease progression was evaluated 8-10 days after inoculation (when the susceptible control was dead), based on the level of invasion of the main stem, using a 1-9 severity scale, where 1 = no symptoms and 9 = invasion of the third node and total plant collapse [40]. Values equal to or less than 4.5 were considered to be a resistant reaction (R), values between 4.5 and 7 were considered to be an intermediate reaction (I), and values equal to or greater than 7 were considered to be susceptible (S). Each plant in the pot was rated separately. The disease value per pot was calculated as the arithmetic mean of the 4-5 plants. Mean values were adjusted identifying outliers through the coefficient of variation (CV = (standard deviation/mean) × 100). Coefficient of variation over 50% was not accepted. In each evaluation, E1 and E2, the disease value per genotype was calculated as the average of the two replicates (pots). E1, E2, and the arithmetic mean of E1 and E2 (Em) were considered to be three sets of data.

Data Analysis
Statistical analyses of the genotypic and phenotypic data were conducted with the R Project for Statistical Computing [41]. The distribution of the SNPs along the eleven chromosomes was visualized with the "CMplot" package in R. Spearman's correlation coefficient was calculated to estimate the association between results from the independent inoculations, E1 and E2.
Broad-sense heritability (H 2 ) for the trait "WM response" within the 294 SDP lines was estimated using the function repeatability() of the "heritability" package in R. H 2 was estimated at plot level in the E1 and E2 evaluations following the equation V g /(V g + V e ), where V g = (MS(G) − MS(E)/r and V e = MS(E), r is the number of replicates per genotype, and MS(G) an MS(E) are the mean sums of squares for genotype and residual error obtained from the analysis of variance. H 2 was estimated at the genotype level in the Em data set following the equation V g /(V g + V e/ r).

Genome-Wide Association Study
To dissect the genetic architecture of a trait as complex as white mold resistance, two approaches were used in this study: (i) consistency between analysis in different data sets (E1, E2, and Em) and (ii) multiple GWAS methodologies and models.
Two different GWAS methodologies were considered: single-locus GWAS (SL-GWAS) based on MLM [26,27] and multi-locus GWAS (ML-GWAS) based on six models: mrMLM, FASTmrMLM, FASTmrEMMA, ISIS EM-BLASSO, pKWmEB, and pLARmEB [28−33]. SL-GWAS was conducted in Tassel v 5.2 [38], while ML-GWAS was performed using the "mrMLM" package [42] in R. To account for the multiple levels of relatedness within the lines included in the panel, population structure and kinship matrix were considered. Population structure was measured by Principal Component Analysis (PCA) conducted on the genotypic data set, considering two principal components according to the results of Campa et al. [7]. The centered-IBS method was used to obtain the kinship matrix, which is an estimate of the additive genetic variance. Both, PCA and kinship matrices were obtained in Tassel v5.2 [7] and used in all the GWAS models. A threshold of −log10(p) equal to or greater than 3.0 was used to identify significant associations between a trait and an SNP.
Three data sets (E1, E2, and Em) were evaluated for each one of the seven GWAS methods (MLM, mrMLM, FASTmrMLM, FASTmrEMMA, ISIS EM-BLASSO, pKWmEB, and pLARmEB). For each method, only significant associations identified in the mean data set (Em) and in, at least, one experimental environment (E1, E2) were considered. Each quantitative trait nucleotide (QTN) was named based on the chromosome and physical position in Mbp of the associated SNP (i.e., the SNP marker located at 42.46 Mbp of chromosome Pv01 would result in the WM1_42.46).
To verify the robustness of these QTNs, T-student tests were conducted to detect significant differences for the response to WM between the two alleles of the corresponding SNP. The observed variation was visualized in a boxplot.

Quantitative Trait Intervals
Quantitative trait intervals (QTIs) were defined as a group of QTNs located at less than 100-kbp upstream and downstream [43] from the significant SNP. In some cases, the physical positions of different QTIs overlapped or were situated very close to one another. Physical positions less than 400-kbp apart were considered to be overlapping QTIs. QTIs were named using the nomenclature chromosome and the lowest physical position in Mbp of the QTI (i.e., if the QTI expands from 42.46 to 43.02 Mbp of chromosome Pv01, it would be named as QTI1_42).
The "ShinyCircos" package [44] in R was used to visualize the position of each QTI and meta-QTLs/QTLs/QTNs previously reported in the bean genome from the underlying markers [23,34].

Candidate Gene Identification
Genes underlying each QTI were analyzed using the legume information system (https: //legumeinfo.org/). Potential candidate genes were considered, based on their function and according to the three main stages proposed in the defense process against S. sclerotiorum: recognition, signal transduction, and defense response [11]. Genes coding for pathogenesis-related proteins or stress-antifungal proteins were also considered as potential candidate genes.

White Mold Evaluation
A total of 294 common bean lines included in the SDP were evaluated against the local WM isolate in two independent evaluations, E1 and E2. E1 and E2 were significantly positively correlated (R = 0.68, p < 0.01; Figure S2). Figure 1 shows the distribution of the reaction to inoculation observed over the 1 to 9 disease severity scale. A wide and continuous distribution of disease reactions was Most lines showed an intermediate or susceptible reaction on the disease severity scale. A total of 28 lines showed values less than 4.5 and were potential resistance sources against WM (Table 1). According to Campa et al. [7], seven of these lines came from the Andean gene pool of common bean, seven from the Mesoamerican, and 14 were recombinants between the two gene pools. Most of these potential resistance sources exhibited an indeterminate growth habit [7]. Most lines showed an intermediate or susceptible reaction on the disease severity scale. A total of 28 lines showed values less than 4.5 and were potential resistance sources against WM (Table 1). According to Campa et al. [7], seven of these lines came from the Andean gene pool of common bean, seven from the Mesoamerican, and 14 were recombinants between the two gene pools. Most of these potential resistance sources exhibited an indeterminate growth habit [7].

Genome-Wide Association Study
For SL-GWAS, 55 significant WM-SNP associations were identified, using the three WM data sets independently (Table S1A): 15 WM-SNP associations were identified for E1, 23 for E2, and 17 for Em. Of these associations, 13 were considered to be robust QTNs located on chromosomes Pv01, Pv02, Pv03, Pv04, Pv08, and Pv09 (Table 2).  For ML-GWAS, 147 significant WM-SNP associations were identified in the three WM data sets (Table S1B). The mrMLM method reveals 34 significant associations, of which only three were consistent QTNs located on chromosomes Pv01, Pv02, and Pv09 (Table 2). For the FASTmrMLM method, 20 significant associations were identified, three of which were considered to be QTNs located on chromosomes Pv01, Pv03, and Pv08. For the FASTmrEMMA, 33 significant associations resulted in the identification of three QTNs on chromosomes Pv01 and Pv08. For the ISIS EM-BLASSO, 22 significant associations resulted in four QTNs on chromosomes Pv01, Pv04, and Pv08. For the pKWmEB, 27 significant associations resulted in three QTNs on chromosomes Pv04 and Pv08. Finally, for the pLARmEB 33 significant associations were identified, giving rise to eight QTNs located on Pv01, Pv02, Pv03, Pv04, Pv08, and Pv09.
In sum, 22 robust QTNs, located on chromosomes Pv01, Pv02, Pv03, Pv04, Pv08, and Pv09, were obtained from SL-GWAS and ML-GWAS and the three data sets considered (Table 2). Among them, five QTNs were identified using more than one GWAS method. Special mention should be made for WM1_45.59 and WM8_52.39, which were identified by seven and six GWAS methods, respectively. To verify the robustness of these 22 QTNs, Student's t-tests were carried out for the mean WM evaluation (Figure 2). Significant differences were observed in all cases.
WM1_45.59 and WM8_52.39, which were identified by seven and six GWAS methods, respectively. To verify the robustness of these 22 QTNs, Student's t-tests were carried out for the mean WM evaluation (Figure 2). Significant differences were observed in all cases. Box plot for the mean WM evaluation plotted according to the haplotype for the 22 SNPs associated with WM response. The X-axis represents the two alleles for each SNP, while the Y-axis corresponds to the phenotype (1 to 9 disease severity scale). T, T-student test value; ** p < 0.01; *** p < 0.001. Figure 2. Box plot for the mean WM evaluation plotted according to the haplotype for the 22 SNPs associated with WM response. The X-axis represents the two alleles for each SNP, while the Y-axis corresponds to the phenotype (1 to 9 disease severity scale). T, T-student test value; ** p < 0.01; *** p < 0.001.

Quantitative Trait Intervals
The 22 QTNs identified were grouped into 15 QTIs (Table 3). Three QTIs were located in a telomeric position of chromosome Pv01, two QTIs were on Pv02, two on Pv03, one on Pv04, four on Pv08, and three on Pv09. The physical positions of these QTIs were compared with those of WM resistance loci previously reported in common bean. Figure 3 shows the overlapping between chromosome regions identified in this work and the QTLs, meta-QTLs, or QTNs identified in previous studies [23,34]. In all, seven new QTIs were identified in the current study as regions involved in the response against WM: QTI1_42, QTI2_34, QTI3_0.9, QTI4_2, QTI8_52, QTI8_55, and QTI9_36. The remaining eight QTIs identified corresponded to regions previously associated with WM resistance: QTI1_45, QTI1_50, QTI2_4, QTI3_39, QTI8_13, QTI8_37, QTI9_11, and QTI9_16. Table 3. QTIs identified in the SDP for resistance to WM. The physical position of each QTI is indicated (chromosome, start, and end position in Mbp). The number of annotated genes and candidate genes underlying each QTI is indicated. Potential candidate genes were grouped in the main three stages described for WM response: recognition, signaling, and defense. The 22 QTNs identified were grouped into 15 QTIs (Table 3). Three QTIs were located in a telomeric position of chromosome Pv01, two QTIs were on Pv02, two on Pv03, one on Pv04, four on Pv08, and three on Pv09. The physical positions of these QTIs were compared with those of WM resistance loci previously reported in common bean. Figure 3 shows the overlapping between chromosome regions identified in this work and the QTLs, meta-QTLs, or QTNs identified in previous studies [23,34]. In all, seven new QTIs were identified in the current study as regions involved in the response against WM: QTI1_42, QTI2_34, QTI3_0.9, QTI4_2, QTI8_52, QTI8_55, and QTI9_36. The remaining eight QTIs identified corresponded to regions previously associated with WM resistance: QTI1_45, QTI1_50, QTI2_4, QTI3_39, QTI8_13, QTI8_37, QTI9_11, and QTI9_16.

Candidate Genes
Genes underlying each QTI (spanning 5.5 Mbp) were investigated to identify potential candidate genes involved in the WM response. There were 468 genes annotated, 61 of which were identified as potential candidate genes based on their function (Table 3; Table S2). Twenty-two proteins were related to the recognition of the pathogen, eight were related to signal transduction, and 31 were related to defense response. No candidate genes were identified at QTI2_34, QTI8_13, and QTI8_37.

Discussion
The use of diversity panels has notably increased since the development of advanced genomic sequencing technologies and the availability of annotated reference genome sequences. In the current work, the resistance response against a local isolate of S. sclerotiorum was studied through GWAS, using the common bean SDP. The SDP was established from local Spanish germplasm, including the Spanish common bean core collection [35], and is considered to be representative of the Spanish genetic diversity for this species [7]. Using the straw test evaluation under greenhouse conditions, a moderate broad-sense heritability was observed for WM resistance. Heritability estimated for WM resistance in previous studies was also low to moderate [12,13,45,46], indicating the large environmental effect in the expression of this resistance.
A total of 28 SDP lines were identified as potential resistance sources against WM, most of them having indeterminate growth habit. Nine of these lines were obtained from the commercial snap varieties Sacha, Bilma, Helda, Planeta, Donna, Vitalis, Florencia, Marconi, and Tendergreen [7] so it could not be excluded that these resistances derived from different breeding programs in commercial varieties. The remaining 19 resistance sources were obtained from local Spanish landraces, which were assigned to different gene pools of common beans: six of them were genetically close to the Andean gene pool, six to the Mesoamerican gene pool and seven were recombinants between the two gene pools [7]. Most of the WM resistance sources identified to date are of Andean origin or from the secondary gene pool such as Phaseolus coccineus [23] so that the identification of six putative new resistance sources from the Mesoamerican gene pool is a significant finding.
Lines SDP035 and SDP071, identified as resistant in the current study, are derived from the Spanish landraces BGE003121 and BGE04000, respectively, and are both included in the Spanish common bean core collection [7,35]. This result agrees with that of Pascual et al. [36] who identified these Spanish landraces as potential sources of resistance against WM. The accession BGE003121 was identified as being of Andean origin based on 12 molecular markers, including the seed protein phaseolin [35]. Later, the use of 3099 SNP markers, obtained by mass genotyping, led to the identification of the line SDP035, derived from accession BGE003121, as a recombinant between the two gene pools, Andean and Mesoamerican [7]. The accession BGE04000 was identified as being of Andean origin, based on 12 molecular markers [35] and 3099 SNP markers [7].
In the present study, two approaches were used to improve the detection power and robustness of the GWAS: (i) analyses in multiple data sets and (ii) and multiple GWAS methodologies, based on both SL-GWAS and ML-GWAS. The control of false positives is very crucial in GWAS, but the effect of false negatives should not be ignored, which can occur if the cutoff value is too stringent. The low heritability observed for white mold resistance has led us to consider the less restrictive threshold value of −log(p) = 3. With these considerations, 22 WM QTNs were identified from the SDP. These QTNs were grouped into 15 QTIs: 3 QTIs on chromosome Pv01, 2 on Pv02, 2 on Pv03, 1 on Pv04, 4 on Pv08, and 3 on Pv09. No significant associations were identified on chromosomes Pv05, Pv06, or Pv07, on which other authors had described white mold QTLs [12,13,16,[18][19][20][21]23]. It is important to note that different WM evaluation methods were used by different authors, so the results from different studies cannot always be compared. The seedling straw test under controlled conditions [39], as used in the present work, is the most frequently used method, but field evaluations [16,18,21,22] or a combination of both, controlled conditions and field, [12,13,17,19,23,40] were also used. Under field evaluations, there is a greater environmental influence and it can be difficult to distinguish between physiological resistance or avoidance mechanisms, such as growth habit, early development, the arrangement of leaves, or internode lengths.
Seven of the 15 QTIs described in the present research were identified for the first time at chromosomes Pv01 (QTI1_42), Pv02 (QTI2_34), Pv03 (QTI3_0.9), Pv04 (QTI4_2), Pv08 (QTI8_52, QTI8_55), and Pv09 (QTI9_36). The QTI4_2 was located at 2.02-2.47 Mbp on chromosome Pv04. It is not surprising that this chromosome region is involved in the resistant response against WM because it corresponds to a well-known cluster of resistance genes encoding protein kinases or proteins with NBS-LRR domains, which are typically involved in resistance response against pathogens [24]. In fact, resistance genes against multiple fungal diseases have been mapped at this position [47][48][49]. One of these genes is Co-3, which is a cluster of closely linked genes that confers resistance against Colletotrichum lindemuthianum, the causal agent of anthracnose of common bean. Co-3 has been identified mainly in the Mesoamerican gene pool. This result could explain why Pv04 is identified in the WM response only in GWAS, where the panels include materials from the Mesoamerican gene pool, whereas most of the resistant sources used in bi-parental populations are of Andean origin [23].
The remaining eight QTIs identified in this work co-localized to WM QTLs, meta-QTLs or QTNs previously reported [23,34]: The  [34]. QTI1_45 could be related to morphological plant traits involved in pathogen avoidance response, because at this position (45.56 Mbp) the gene PHAVU_001G189200g has been proposed as a candidate of fin, the gene controlling growth habit with recessive genotypes showing determinacy [50,51]. The implication of this chromosome region in avoidance mechanisms have also been suggested by other authors [12,20].
Protein kinase and NBS-LRR genes constitute the largest plant disease resistance gene (R gene) family. Protein kinases are involved in defense mechanisms at different cellular levels, including elicitor recognition, as extracellular or intracellular receptors, in signal transduction, and the induction of transcriptional activation [54]. Furthermore, many NBS-LRR proteins recognize effectors secreted by pathogens directly or indirectly, which, in turn, activate downstream signaling pathways, leading to the activation of plant defense responses against various classes of pathogens, including bacterial, fungal, and viral, as well as nematode and insect pests [47,55]. Among the potential candidate genes identified, 28 encode for typical R genes products like protein kinases or NBS-LRR proteins and are located in all QTIs except for QTI8_55 and QTI9_36.
Genes Phvul.002G055700 and Phvul.002G055800, both ethylene-responsive transcription factors, have been proposed as candidate resistance genes against WM by Vasconcellos et al. [23]. Ethylene signaling has been shown to be involved in defense against S. sclerotiorum in Arabidopsis thaliana [56] and the closely related Brassica napus [57]. Both genes, Phvul.002G055700 and Phvul.002G055800, were identified in the present research at QTI2_4 and have been proposed as potential candidate genes against WM [23].
White mold produces many molecules and proteins, such as polygalacturonases (PGs), during infection and colonization to breach host cell walls [58]. Host plants produce polygalacturonaseinhibiting proteins (PGIPs) that recognize PGs and prevent enzymatic action during the invasion. In common bean, PGIPs consist of a complex locus of four genes organized in a cluster (PvPGIP1, PvPGIP2, PvPGIP3, and PvPGIP4) that spans a 50-kbp region at around 36 Mbp of chromosome Pv02 [59]. It has been demonstrated that this locus in the common bean is involved in defense against insects and fungi [59], including S. sclerotiorum infection [60,61].
Close to this PGIP cluster was located the QTI2_34 (34.54-34.84 Mbp), in which no candidate genes for WM resistance had previously been identified. It cannot be discarded that this cluster of PGIP genes may be the causal genes for WM response at this QTI. It is well known that SNPs may occasionally affect distant genes and the common practice of identifying candidate genes based on the nearest SNP association may lead to some difficulties in the identification of candidate genes [62,63].

Conclusions
The evaluation of the response to a local isolate of S. sclerotiorum in the SDP revealed 28 lines with high levels of resistance against this pathogen. Among these potential resistance sources, six were of Mesoamerican origin. This is a significant finding because most of the WM resistance sources identified to date are of Andean origin or from the secondary gene pool. The GWAS led to the validation of eight regions previously reported as being associated with WM resistance at chromosomes Pv01 (QTI1_45, QTI1_50), Pv02 (QTI2_4), Pv03 (QTI3_39), Pv08 (QTI8_13 and QTI8_37), and Pv09 (QTI9_11, QTI9_16). The GWAS also identified seven novel regions as being involved in the WM response located at chromosomes Pv01 (QTI1_42), Pv02 (QTI2_34), Pv03 (QTI3_0.9), Pv04 (QTI4_2), Pv08 (QTI8_52, QTI8_55), and Pv09 (QTI9_36). These results verify the complexity of this quantitative resistance, validate the implication of eight chromosome regions in WM response, and provide new targets that could be of major interest to bean breeders.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/11/12/1496/s1, Figure S1: Distribution along the eleven bean chromosomes of the 5384 single nucleotide polymorphisms (SNPs); Figure S2: Scatter plot obtained using the ggpubr package in R to display the relationship between the two variables WM evaluation 1 and WM evaluation 2. R, Spearman correlation coefficient; p, significance level; Table S1: (A) Significant SNP-WM response associations (−log10(p) ≥ 3) identified in the single-locus analysis conducted in Tassel. (B) Significant SNP-WM response associations (−log10(p) ≥ 3) identified in the multi-locus analysis conducted in mrMLM package of R software; Table S2: Genes underlying each QTI identified in this work. Potential candidate genes are indicated in red color. The potential implication of these candidate genes in recognition, signal transduction, or defense response is indicated.