Quantitative Trait Locus Mapping of Marsh Spot Disease Resistance in Cranberry Common Bean (Phaseolus vulgaris L.)

Common bean (Phaseolus vulgaris L.) is a food crop that is an important source of dietary proteins and carbohydrates. Marsh spot is a physiological disorder that diminishes seed quality in beans. Prior research suggested that this disease is likely caused by manganese (Mn) deficiency during seed development and that marsh spot resistance is controlled by at least four genes. In this study, genetic mapping was performed to identify quantitative trait loci (QTL) and the potential candidate genes associated with marsh spot resistance. All 138 recombinant inbred lines (RILs) from a bi-parental population were evaluated for marsh spot resistance during five years from 2015 to 2019 in sandy and heavy clay soils in Morden, Manitoba, Canada. The RILs were sequenced using a genotyping by sequencing approach. A total of 52,676 single nucleotide polymorphisms (SNPs) were identified and filtered to generate a high-quality set of 2066 SNPs for QTL mapping. A genetic map based on 1273 SNP markers distributed on 11 chromosomes and covering 1599 cm was constructed. A total of 12 stable and 4 environment-specific QTL were identified using additive effect models, and an additional two epistatic QTL interacting with two of the 16 QTL were identified using an epistasis model. Genome-wide scans of the candidate genes identified 13 metal transport-related candidate genes co-locating within six QTL regions. In particular, two QTL (QTL.3.1 and QTL.3.2) with the highest R2 values (21.8% and 24.5%, respectively) harbored several metal transport genes Phvul.003G086300, Phvul.003G092500, Phvul.003G104900, Phvul.003G099700, and Phvul.003G108900 in a large genomic region of 16.8–27.5 Mb on chromosome 3. These results advance the current understanding of the genetic mechanisms of marsh spot resistance in cranberry common bean and provide new genomic resources for use in genomics-assisted breeding and for candidate gene isolation and functional characterization.


Introduction
Common bean (Phaseolus vulgaris L., 2n = 2x = 22) is a widely grown grain legume crop planted in Canada with areas up to 160,000 ha and dry seed production up to 316,800 Mt [1]. As reported by the FAO (Food and Agriculture Organization), common bean global production reached 28.9 million tons within 33.1 million ha around the world in 2019. Over half of global production was shared by Asians [2]. It is not only a crucial crop for food security, but it is also highly nutritious, meeting human nutrition requirements for proteins, acids, such as nicotinamide (NA) or Phyto-siderophores [20,26]. NRAMP protein families are members of the major proteins implicated in Mn transportation from the root to the stem [27,28]. The CAX family mainly regulates the influx of cations into the vacuole. Its members are metal transporters that arbitrate the influx of cations into the vacuole [29,30].
Here, we report on the quantitative trait loci (QTL) associated with marsh spot resistance and on the putative candidate genes with a goal to assist in the development of diagnostic markers for marker-assisted breeding and to provide genomics resources towards the cloning of the causal genetic features of marsh spot in beans.

SNP Identification
A total of 13,064,398 paired-end genotyping by sequencing (GBS) reads corresponding to 196 Mb were generated from the sequencing of the 138 recombinant inbred lines (RILs). Considering a genome size of 473 Mb [31], the average genome coverage was 3.65X per line, ranging from 0.12X to 11.97X. To identify the parental origin of the variants identified in the RILs, the two parents were sequenced at a high coverage depth of 32.89X for Cran09 and 29.45X for Messina. An average of 78.56% of the reads of the RILs were aligned to the Andean type common bean genome G19833 reference genome (V2.1) [31], ranging from 66.54% to 82.70% (Table S1).
A total of 54,620 single nucleotide polymorphisms (SNPs) were identified by aligning GBS reads of the 138 RILs to the reference genome (V2.1) [31]. Filtering for minor allele frequency (MAF) > 0.01 and call rate > 20% yielded a total of 2066 SNPs. In addition, eight SNPs mapped to small scaffolds and were removed (Table S2). The SNPs were distributed across the whole genome, with an average of 188 SNPs per chromosome (Chr) ( Figure S1). Some regions on Chr 2, 3, 4, 6 displayed high-density SNP regions ( Figure S1). Among the 2058 SNPs, 1863 SNPs were polymorphic between parents (Cran09 and Messina), and 195 SNPs had no call in one of the parents (Table 1). Then, 785 SNPs that had significant segregation distortion at a 0.05 probability level were eliminated. Finally, 1273 SNPs were further imputed and used for linkage map construction ( Figure 1).   x ± s: mean genetic distance between markers ± standard deviation (cm).

Genetic Map
A genetic map of the 11 linkage groups or chromosomes was constructed containing 1273 SNP markers ranging from 9 on Chr 8 to 360 on Chr 5. Most SNPs that were identified on the same chromosomes on the reference sequence were grouped into the same linkage groups (Table S3) and showed consistent orders in the physical chromosomes ( Figure S2). The map consisted of 423 recombination intervals with a total length of 1599 cm and an average interval of 3.78 cm (Table 1). Since only nine markers were retained on Chr 8 after removing SNPs of significant segregation distortion, a large average interval (36.11 cm) between markers was obtained.

Genomic Heritability
The genomic heritability (h 2 ) of common bean resistance to marsh spot was estimated for MSRI using the genetic additive variance of all SNPs and phenotypes by GBLUP. The h 2 estimates ranged from 12.07% to 55.91% in all 18 datasets with the highest h 2 (55.91%) originating from the overall mean dataset of MSRI (Table 2).

Mapping of Additive QTL
Using two genetic map-based statistical models (ICIM-ADD and GCIM) and the haplotype block-based genome-wide association study (GWAS) model RTM-GWAS, a total of 18 QTL were identified from 18 phenotypic datasets. The QTL identified using different models were grouped into single QTL because they co-located on chromosomes or were within the same haplotype block. To validate the QTL identified by the different statistical models and from different phenotypic datasets (environments), we performed single factor (alleles) ANOVA for each identified QTL using the 18 phenotypic datasets. Two of the 18 QTL had no significant allelic differences in QTL effects in >15 datasets and were removed. The 12 stable QTL presented significant QTL effects in most of the phenotypic datasets (>10) with the mean R 2 ranging from 6.81% (QTL.6.1) to 24.52% (QTL.3.1), whereas the remaining four QTL (QTL.1.1, QTL.5.1, QTL.6.2 and QTL.9.1) explained 5.9-7.8% of phenotypic variation in three to five phenotypic datasets, indicative of environment-specific features (Tables 3 and S4, Figure 2).
Of the sixteen QTL, one was located on Chr 1, six on Chr 2, two on Chr 3, four on Chr 5, two on Chr 6, and one on Chr 9 (Table 3). One QTL was identified by a single SNP, or a quantitative trait nucleotide (QTN). In all sixteen QTL, two QTL were detected by three models, seven QTL by two models, and seven QTL by only one model. The LOD value of a QTL represents its significance extent. The LOD values for QTL identified from ICIM-ADD and GCIM varied from 3.14 to 7.58. Thirteen out of sixteen QTL had relatively high absolute values of additive effects (≥0.1) ranging from 0.1 to 0.57 (Table 3).

Mapping of Additive QTL
Using two genetic map-based statistical models (ICIM-ADD and GCIM) and the haplotype block-based genome-wide association study (GWAS) model RTM-GWAS, a total of 18 QTL were identified from 18 phenotypic datasets. The QTL identified using different models were grouped into single QTL because they co-located on chromosomes or were within the same haplotype block. To validate the QTL identified by the different statistical models and from different phenotypic datasets (environments), we performed single factor (alleles) ANOVA for each identified QTL using the 18 phenotypic datasets. Two of the 18 QTL had no significant allelic differences in QTL effects in >15 datasets and were removed. The 12 stable QTL presented significant QTL effects in most of the phenotypic datasets (>10) with the mean R 2 ranging from 6.81% (QTL.6.1) to 24.52% (QTL.3.1), whereas the remaining four QTL (QTL.1.1, QTL.5.1, QTL.6.2 and QTL.9.1) explained 5.9-7.8% of phenotypic variation in three to five phenotypic datasets, indicative of environment-specific features (Tables 3 and S4, Figure 2).

Mapping of Epistatic QTL
The additive-epistasis model ICIM-EPI was used to detect interactions among QTL. A total of three QTL pairs with significant epistatic effects were identified, involving two additive QTL identified using additive models: QTL.2.3 and QTL.5.4 (Table 4, Figure 3 (Table 3). One QTL was identified by a single SNP, or a quantitative trait nucleotide (QTN). In all sixteen QTL, two QTL were detected by three models, seven QTL by two models, and seven QTL by only one model. The LOD value of a QTL represents its significance extent. The LOD values for QTL identified from ICIM-ADD and GCIM varied from 3.14 to 7.58. Thirteen out of sixteen QTL had relatively high absolute values of additive effects (≥0.1) ranging from 0.1 to 0.57 (Table 3).

Mapping of Epistatic QTL
The additive-epistasis model ICIM-EPI was used to detect interactions among QTL. A total of three QTL pairs with significant epistatic effects were identified, involving two additive QTL identified using additive models: QTL.2.3 and QTL.5.4 (Table 4, Figure 3). QTL. 5  . Box plots of marsh spot resistance index (MSRI) in terms of the combinations of favorable alleles of three quantitative trait locus (QTL) pairs identified using the ICIM-EPI model. In each box plot, the black dots represent data points and the red dot represents the mean of the data points. ff: both QTL 1 and QTL 2 were favorable alleles. uf: QTL 1 was an unfavorable allele while QTL 2 was a favorable allele; fu: QTL 1 was a favorable allele but QTL 2 was an unfavorable allele; uu: both QTL 1 and QTL 2 were unfavorable alleles. . Box plots of marsh spot resistance index (MSRI) in terms of the combinations of favorable alleles of three quantitative trait locus (QTL) pairs identified using the ICIM-EPI model. In each box plot, the black dots represent data points and the red dot represents the mean of the data points. ff: both QTL 1 and QTL 2 were favorable alleles. uf: QTL 1 was an unfavorable allele while QTL 2 was a favorable allele; fu: QTL 1 was a favorable allele but QTL 2 was an unfavorable allele; uu: both QTL 1 and QTL 2 were unfavorable alleles.
Despite the significant interactions between pairs of QTL, on average, the number of favorable alleles of individuals tended to be positively correlated with MSRI ( Figure S4).

Contribution of All Detected QTL to Marsh Spot Resistance
In examining only additive effects, multiple linear regression models of all 16 QTL for each of 18 phenotypic datasets were constructed to calculate the overall contribution of all QTL to the phenotype variation. The R 2 of the model represents the portion of the phenotypic variation explained by all 16 QTL. The R 2 estimates of the 18 regression models ranged from 46.08% (S2019) to 75.37% (overall dataset), with a mean R 2 of 61.98% ( Figure 4). However, when both additive and epistatic effects of the QTL were considered, i.e., the additional two QTL that were influenced by the significant epistatic effects of three of the 16 QTL, the R 2 estimates of the models increased and ranged from 56.21% to 81.87% with a mean R 2 of 69.64% (Figure 4). Despite the significant interactions between pairs of QTL, on average, the num favorable alleles of individuals tended to be positively correlated with MSRI (Figure

Contribution of All Detected QTL to Marsh Spot Resistance
In examining only additive effects, multiple linear regression models of all 16 for each of 18 phenotypic datasets were constructed to calculate the overall contrib of all QTL to the phenotype variation. The R 2 of the model represents the portion phenotypic variation explained by all 16 QTL. The R 2 estimates of the 18 regression m ranged from 46.08% (S2019) to 75.37% (overall dataset), with a mean R 2 of 61.98% (F 4). However, when both additive and epistatic effects of the QTL were considered, i. additional two QTL that were influenced by the significant epistatic effects of three 16 QTL, the R 2 estimates of the models increased and ranged from 56.21% to 81.87% a mean R 2 of 69.64% ( Figure 4).  ). The remaining QTL had relativel RCs (ranging from 0.3 to 6.79%). However, due to partial correlation among QTL, th and mean R 2 values of QTL were not always consistent ( Figure 5). For example, QT and QTL.3.2 had the highest mean R 2 values (24.5% and 21.8%, respectively); how their mean RC values over the 18 datasets were not the highest (7.91% and 7 respectively) ( Figure 5, Table S5). The relative contribution (RC) of each QTL to MSRI values estimated in the datasets is listed in Table S5. QTL.5.4 which also had a significant epistatic effect with QTL.2.7 and QTL.2.8 had the largest RC (13.48%), followed by QTL.2.1 (10.4%), QTL.5.3 (9.44%), QTL.2.6 (8.9%), QTL.3.2 (8.26%), QTL.3.1 (8.14%). The remaining QTL had relatively low RCs (ranging from 0.3 to 6.79%). However, due to partial correlation among QTL, the RC and mean R 2 values of QTL were not always consistent ( Figure 5). For example, QTL.3.1 and QTL.3.2 had the highest mean R 2 values (24.5% and 21.8%, respectively); however, their mean RC values over the 18 datasets were not the highest (7.91% and 7.93%, respectively) ( Figure 5, Table S5).

Favorable Alleles of QTL in RILs
The number of favorable alleles of the 16 additive QTL (Table 3) in each RIL is illustrated in Figure 6. The number of favorable alleles was highly correlated with MSRI values (R 2 = 72.48%) (Figure 7). To further validate this relationship, the overall dataset of the 15 most resistant lines (0.01 ± 0.01 of MSRI) and the 15 most susceptible lines (0.66 ± 0.19 of MSRI) was extracted. The number of favorable alleles of the 15 most resistant lines (11.5 ± 1.8) was significantly higher than that of the 15 most susceptible lines

Favorable Alleles of QTL in RILs
The number of favorable alleles of the 16 additive QTL (Table 3) in each RIL is illustrated in Figure 6.     The number of favorable alleles for all three pairs of epistatic QTL identified by ICIM-EPI were also counted in the RILs. Interestingly, a significant correlation between the number of favorable alleles and MSRI was still observed, indicating that the additive effects existed despite the significant epistatic effects of these QTL ( Figure S4).  The number of favorable alleles for all three pairs of epistatic QTL identified by ICIM-EPI were also counted in the RILs. Interestingly, a significant correlation between the number of favorable alleles and MSRI was still observed, indicating that the additive effects existed despite the significant epistatic effects of these QTL ( Figure S4). The number of favorable alleles for all three pairs of epistatic QTL identified by ICIM-EPI were also counted in the RILs. Interestingly, a significant correlation between the number of favorable alleles and MSRI was still observed, indicating that the additive effects existed despite the significant epistatic effects of these QTL ( Figure S4).

Candidate Genes of Major QTL
Since marsh spot disease is most likely caused by Mn deficiency, all potential candidate genes associated with Mn deficiency and Mn content in plants were screened [16]. A list of 151 annotated genes likely related to Mn deficiency or Mn content were identified from other plant species, such as Arabidopsis, rice (Oryza sativa) and barley (Hordeum vulgare) and mapped to the common bean reference genome. Then, a window of upstream and downstream 100 Kb flanking each QTL region was scanned to identify QTL harboring such genes. Six of the QTL co-located with a total of 13 genes (Table 5). Among them, four QTL (QTL.1.1, QTL.3.1, QTL.3.2, and QTL.5.2) harbored eight genes encoding heavy metal transport/detoxification superfamily protein. In particular, two QTL (QTL.3.1 and QTL.3.2) which explained the highest phenotypic variation of 12.2-24.5% harbored five metal transport genes Phvul.003G086300, Phvul.003G092500, Phvul.003G104900, Phvul.003G099700, and Phvul.003G108900 in a large genomic region of 16.8-27.5 Mb on Chr 3. Other gene families, including one zinc transporter (ZIP), ZIP metal ion transporter, cation efflux family, natural resistance-associated macrophage protein (NRAMP), and manganese tracking factor for mitochondrial SOD2 were also co-located to QTL.2.3, QTL.5.2, or QTL.9.1 (Table 5). These genes are responsible for Mn transporter, metals homeostasis, and detoxification in plants and are very likely to be causal genes controlling marsh spot.

Discussion
The previous investigation revealed the inheritance of marsh spot resistance in cranberry common beans was likely controlled by at least four genes with additive and epistatic effects [19]. Using joint segregation analysis (JSA) [33,34], the marsh spot phenotypic data of the RIL population best fitted a genetic model with four major genes with additive and interaction effects. The estimated epistatic effects were even larger than additive effects. In this study, using the same population and phenotypic datasets, we identified 16 additive and three pairs of epistatic QTL, validating and confirming that marsh spot resistance is a quantitatively inherited trait controlled by at least four genes. Due to the theoretical limitation of the maximum gene number of the JSA genetic models, a maximum of four major genes can be estimated [33]. In addition, the current version of JSA can only estimate major genes without minor gene effects; thus, the number of major genes was underestimated and the effects from minor genes were ignored. The current study further validated the previous results, extended the discovery, and proposed candidate genes for the QTL that support the role of Mn in marsh spot disease [19].

Genomic Heritability and Contribution of QTL to Marsh Spot Resistance
To understand the overall contribution of the identified QTL, multiple linear regression models of all QTL on each of the 18 phenotypic datasets were constructed. The average R 2 of the models with both additive and epistatic QTL was greater than that of the models with only additive QTL, indicating that the additional epistatic QTL were also useful for improving marsh spot resistance despite some negative interaction between some epistatic QTL ( Table 4).
The lowest R 2 values were always obtained from the phenotypic data sets of single years with one soil type (2019/Sandy soil and 2018/Heavy clay soil). The R 2 values of the mean value datasets (such as overall means, means of heavy clay over five years and means of sandy soil over five years) were greater than those from the single environments (single year and single soil type), showing the strong environmental effects on QTL. Of all identified QTL, most were detected from the overall mean dataset or several other mean datasets. Thus, the phenotypic data over multiple years and/or multiple locations helps to identify stable QTL [35].
In the overall mean dataset, genomic heritability (h 2 ) was estimated to be 55.91%, indicating that the 1273 SNPs identified in the RIL population explained more than half of the phenotypic variance. The missing proportion of the phenotypic variation could be the result of the non-additive effects of SNP markers or missed SNPs in marker-poor regions [36]. The h 2 estimates also varied from different phenotypic datasets. For example, the h 2 was extremely low (12.07%) in the 2019/sandy soil dataset (S2019). MSRI observations in S2019 were significantly lower than those of other datasets possibly due to the higher concentration of Mn in the sandy soil field in 2019. Thus, the estimation of h 2 is possibly affected by the interaction between genetic background and environment [37].
The overall R 2 values of all the identified QTL estimated in the regression models (75.37%) were higher than the h 2 estimates (55.91%) ( Table 2). The difference between the two estimates may be because different statistical models were used but this result implied that most of the QTL associated with marsh spot resistance existing in this RIL population may have already been identified using a combination of different statistical models.

Statistical Models for QTL Identification and QTL Validation
Each statistical model for QTL mapping has its own advantages and limitations. The simultaneous utilization of multiple models would be a reasonable and practical strategy to make full use of their merits and overcome potential disadvantages. Several statistical models have been developed for QTL mapping in bi-parental populations, involving linkage map-based models and GWAS models [35,[38][39][40]. In this study, four statistical models were used to identify QTL, including three linkage map-based models (i.e., ICIM-ADD, ICIM-EPI and GCIM) and one haplotype block-based GWAS model (i.e., RTM-GWAS). Unlike traditional interval mapping (IM) and composite interval mapping (CIM) models, the ICIM model can control polygenetic background through a prerequisite selection of markers in QTL mapping. Those polygenes with large and moderate effects were well controlled to reduce the rate of false positives [41,42]. GCIM provides a new method to control polygenetic background by estimating polygenetic variance in GWAS, which can control the background of polygenes with large, moderate and small effects. Compared with ICIM models, GCIM outperformed ICIM in small effect QTL detection. However, in some cases in the GCIM model, several peaks around one QTL could be identified at the same time, thus the true QTL was difficult to define. For example, four QTL, QTL.2.3, QTL2.4, QTL.2.5 and QTL.2.6 neighboured each other on the genetic map and on the chromosome in terms of their physical chromosomal locations (Table 3). They spanned a genomic region of 34.0-38.4 Mb or 128.9-188.3 cm on the genetic map. RTM-GWAS first generates and groups SNPs into LD blocks, and then QTL mapping is performed based on these LD blocks, through a process called SNPLDBs [43]. The employment of LD blocks as markers can notably decrease the possibility of false positives during multiple hypotheses in the GWAS model [40]. In this study, all 16 additive QTL were identified using three genetic map-based models (Table 3), demonstrating their detection power in QTL mapping [44][45][46], whereas six of them were validated by RTM-GWAS. These results indicate that the value of using multiple models that combine the genetic map-based and GWAS models facilitates the detection of QTL with small effects and the validation of QTL.
Further validation of the QTL can also be achieved through the use of linear regression models as performed herein where significant correlations between QTL alleles and phenotypes of RILs were shown. Although significant correlations were confirmed for most of the putative QTL identified by statistical models, two of the 18 original QTL did not pass the significance test and were declared false positive QTL.
While all statistical models may result in some false positive QTL, the use of multiple QTL models and other validation methods such as the linear regression models can be capitalized upon to identify them.

Additive/Epistatic QTL and Genomics-Assisted Selection
With the development of genotyping technologies, genomics-assisted selection such as marker-assisted selection (MAS) and genomic selection (GS) has been widely used for the selection of traits controlled by major genes or polygenes in many crops including the common bean [47][48][49][50]. MAS and GS aim at predicting the phenotypes of individuals based on the use of known molecular marker information without expensive or time-consuming phenotyping of the individuals. MAS tends to select superior lines through major genes or large-effect QTL, while GS utilizes high-density genome-wide markers or QTL to predict the performance of individuals.
In this study, a total of 16 additives and three pairs of epistatic QTL have been identified. These QTL explained most phenotypic variations for marsh spot resistance. The accumulation of favorable alleles in RILs via the hybridization of two parents and recombination has greatly improved common bean resistance to marsh spot. The most resistant RILs had significantly greater favorable additive alleles than the most susceptible RILs ( Figure 8B). The number of favorable alleles in a RIL had a significantly positive correlation with marsh spot resistance (i.e., negative correlation with MSRI) (Figure 7), showing a significant increase in the number of favorable alleles from susceptible to resistance lines ( Figure 6). MAS or GS are both effective in pyramid favorable alleles of QTL to develop future resistant cultivars in plant breeding. The lines containing more favorable alleles especially those of QTL with high R 2 and relatively large contributions will be preferentially selected in breeding. In this study, QTL.5.3 and QTL.5.4 had the highest RC while QTL.3.1 and QTL.3.2 had the highest R 2 values (Figure 3). These four QTL could be taken into consideration for future breeding. For those epistatic QTL, additive effects still contributed resistance to marsh spot disease. Therefore, in a breeding program, the epistatic effect of QTL markers should also be considered in the selection of molecular markers for optimal QTL combinations.
A total of 1273 SNP markers identified from the RIL population were used for QTL mapping in this study. The estimates of genomic heritability for marsh spot resistance indicate that more than half of the phenotypic variation can be explained using these SNPs, providing the potential to perform GS to improve marsh spot resistance in common bean breeding.

Candidate Gene Prediction
Although QTL mapping or GWAS have been widely used to identify QTL or QTNs associated with traits of importance and to predict potential candidate genes, their applications were limited. First of all, the prediction of candidate genes relies on whether the detected QTL/QTNs are not false positives. Then, the application of designating a potential candidate gene of a QTL depends on many other factors, such as the number of markers used, marker density on chromosomes, the recombination rate of genomic regions, and so on. To date, the most popular and simple approach for predicting candidate genes is to investigate the annotated genes in the vicinity of the QTL, such as a window of a specific physical distance flanking the QTL [51][52][53]. Although functional validation is the ultimate goal, candidate gene prediction of QTL/QTNs based on chromosomal location combined with a priori knowledge of gene functions has the potential to significantly narrow down the candidate gene list. This approach requires a list of annotated genes associated with the traits that have been validated to some extent in previous studies.
Marsh spot symptoms are likely caused by Mn deficiency due to the low availability of soil Mn, a limited capacity for Mn uptake and transport, and/or because of interference from other physiological pathways involving Mn, such as deoxidation [16,18,[54][55][56][57]. Our previous study indirectly showed that Mn concentration in soil may be associated with the development of marsh spot in cranberry beans [19]. Here, candidate genes with the possible function of Mn transporter and deoxidation were annotated. Two Mn transporter protein-coding genes are co-located to QTL 5.2: one is the zinc transporter (ZIP) coding gene Phvul.005G048900, and the other one is the cation efflux (CAX) family protein-coding gene Phvul.005G049300. ZIP family members are involved in the transport of Mn in stellar root cells and present in the tonoplast, and took part in remobilizing Mn from the vacuoles to the cytoplasm [58]. The members of the CAX family are metal transporters and mainly control the influx of cations into the vacuole [59]. CAX-like transporters were found in other species, such as LeCAX2 in tomato (Solanum lycopersicum L.), and HvCAX2 in barley (H. vulgare). They are expressed ubiquitously in the roots, shoots, immature spikes and seeds [29].
Mn, naturally plentiful in most soils, should be adequately available to plants. However, deficiencies occur when in soils with high pH, high organic matter or during cold and wet conditions. Some of the identified candidate genes could play a role in Mn regulation in plants. Such as the ZIP gene mentioned above, the expression of the ZIP gene could allow plants to absorb more Mn from soil or remobilize more Mn to seeds during germination, thus, the development of marsh spot in seeds could be prevented or reduced.

Recombinant Inbred Lines (RILs)
An F 2:7 RIL population of 138 individuals derived from a cross between the marsh spot susceptible cultivar "Messina" and the highly-resistant cultivar 'Cran09' was generated [19]. F 2 plants were selfed and propagated by single seed descent to the F 7 generation to ensure a high percentage (>98%) of homozygosity.

Phenotyping of Marsh Spot Resistance
From 2015 to 2019, the 138 RILs and their two parents were evaluated for marsh spot severity in sandy and heavy clay soils as previously described [19]. Briefly, the field trials were conducted in a partially balanced lattice design with three replications at the Morden Research and Development Centre, Morden, Manitoba, Canada (49 • 11 N, 98 • 5 W). Each line was planted in a 5 m-long row with 75 cm spacing between rows, and herbicides and fertilizers were applied to ensure optimal growth following standard commercial production guidelines. After harvest, ten seeds were randomly selected from each line and rated for marsh spot severity using a 0 to 5 scale, where 0 indicates no symptoms and 5 represents the most severe symptoms. The marsh spot resistance index (MSRI) was used to estimate the severity of the disease for each of the RILs: MSRI = ∑ n i=0 (number of seeds at a rating with 0 − 5 scale × the rating) Total number of seeds where n is the total number of ratings and i = 0, 1, . . . , 5, respectively. A total of 18 phenotypic datasets were collated: ten for each combination of the five years and two soil types, five for the means of each year over the soil types, two for the means of soil types over the five years and one for the means overall years and soil types. Statistical illustrations were drawn using the R package 'ggplot2' (https://cran.r-project. org/web/packages/ggplot2/index.html, (accessed on 1 May 2021)). The detailed ratings for marsh spot and the analyses of the phenotypic data were carried out as previously described [16,19].

Genotyping by Sequencing and SNP Identification
Seeds of the individual RILs along with the parental lines were grown in a growth chamber. At the 2-leaf stage, 75 mg of leaf tissue was sampled and flash-frozen in liquid nitrogen before being lyophilized in a FreeZone benchtop freeze-dryer (Labconco, Kansas City, MO, USA). Genomic DNA was extracted using the DNeasy 96-well kit (Qiagen, Germantown, MD, USA) and quantified with the Quant-iT™ PicoGreen™ dsDNA assay kit (ThermoFisher, Waltham, MA, USA) following the manufacturer's instructions. The DNA samples were diluted to 20 ng/µL, and 10 µL of each sample was used for library construction.
The library preparation and sequencing service was provided by the Centre d'expertise et de services Génome Québec (Montréal, QC, Canada). The GBS library was constructed for each of the 138 RILs and ten libraries each for the parents, for a total of 158 libraries. Library construction was conducted at the Institute of Integrative Biology and System (IBIS, Université Laval, Québec, QC, Canada) using the MspI/PstI restriction enzyme combination as previously described [60]. The 158 indexed libraries were pooled and sequenced on 20 35 M-read NovaSeq 6000 lanes using the paired-end 150 bp (PE150) mode at the Centre d'expertise et de services Génome Québec.
As the cranberry common bean belongs to the Andean gene pool [61], the common bean reference genome v2.1 of Andean type landrace G19833 [31] was used as a reference for SNP discovery. The generated raw read data were filtered using the AGSNP pipeline [62] for standard quality and aligned to the reference genome using the Burrows-Wheeler Alignment tool (BWA V0.78-r455). Variant detection was performed using SAMTools V1.15.1 [63]. The entire procedure was implemented in the updated custom GBS analysis pipeline [64,65]. As a quality check, only SNPs that were polymorphic between parents and that weresegregated in the RIL population were selected. Then, SNPs with MAF > 0.01 and call rate > 20% were retained. Missing SNPs were then imputed using Beagle V5.1 [35] to produce the SNP dataset for linkage map construction. SNPs were assigned to LD blocks (D > 0.8) using the R package gpart V1.13.0 (http://bioconductor.org/packages/release/ bioc/html/gpart.html, (accessed on 2 June 2021)) and one representative SNP was chosen to represent each block. Because the construction of a genetic linkage map and the detection of QTL may be influenced by segregation distortion, the Chi-square test in IciMapping V4.2 (https://isbreeding.caas.cn/rj/qtllcmapping/294445.htm, (accessed on 24 July 2019)) [66] was used to evaluate the significance of segregation ratios and SNPs that significantly deviate from the expected 1:1 ratio (p < 0.05) were excluded.

Genomic Heritability
Genomic heritability (h 2 ), representing the proportion of additive genetic variance component of the total phenotypic variance, was estimated for all SNPs using the R package 'sommer' V4.1 (https://cran.r-project.org/web/packages/sommer/index.html, (accessed on 5 January 2021)) with the genomic best linear unbiased prediction (GBLUP) model.

Construction of Linkage Map
Construction of the linkage map was performed using QTL IciMapping V4.2 software [66]. The SNPs were divided into linkage groups based on their physical positions on chromosomes and subsequently ordered based on their recombinant frequencies. A maximum recombination frequency of 0.35 centimorgan (cm) was used from three criterion options. Genetic map distances were estimated using the Kosambi mapping function [67]. The linkage groups were assigned to their corresponding chromosomes based on the SNPs identified on the reference genome [31].

QTL Identification
IciMapping V4.0 [66] with the inclusive composite interval mapping (ICIM) and composite interval mapping (CIM) models was used for QTL mapping of the RIL population. In ICIM, forward and backward stepwise regressions were first computed, and Expectationmaximization (EM) iterations were then applied to consider all markers simultaneously. The additive (ICIM-ADD) and additive + epistatic (ICIM-EPI) models were used to detect QTL with additive and/or epistatic effects, respectively.
Genome-wide composite interval mapping (GCIM) [45] was also used to detect the large and small effects of QTL. The GCIM model includes two steps. The first involves the scanning of putative QTL across the genome using a single-locus random mixed linear model used in GWAS, and the second is the integration of the selected putative QTL into a multi-QTL mixed linear model. The QTL effects were calculated by the empirical Bayes method with the likelihood ratio test employed on true QTL detection [45]. GCIM was implemented using the R package QTL.gCIMapping.GUI V2.1.1 (https://cran.r-project. org/web/packages/QTL.gCIMapping.GUI/index.html, (accessed on 10 December 2021)).
Permutation tests of 1000 iterations [68] under the type I error α = 0.05 was performed to obtain the LOD scores to be used as thresholds of significance for QTL detection.
A haplotype block-based GWAS method, RTM-GWAS (restricted two-stage, multilocus, multi-allele GWAS) V2020.0 [43], was also employed to detect QTL regions. RTM-GWAS first groups all SNP markers that shared strong linkage disequilibrium (LD) (D' > 0.8) into LD blocks, and then uses those LD blocks for QTL detection. A significance level of 0.05 was used for the pre-selection of individual candidate markers under the single locus model, and an experiment-wide significance level of 0.05 was used for the stepwise regression to declare the significant QTL under the multi-locus model.
One way-ANOVA was performed for each QTL to further test the statistical significance of MSRI between QTL alleles in all 18 phenotypic datasets. QTL were considered as single fixed factors with two or more alleles. In addition, the R 2 of each QTL was estimated as the proportion of phenotypic variation in the RIL population explained by alleles of the QTL. A higher R 2 value indicates the QTL has a stronger effect on the marsh spot resistance. The average R 2 of a QTL was calculated using R 2 values that were statistically significant in all 18 datasets.
To calculate the relative contribution (RC) of each QTL to total phenotypic variation, the relative R 2 value of each QTL was calculated based on a linear model containing all QTL using the R package 'relaimpo' V2.2 (https://cran.rstudio.com/web/packages/relaimpo/ index.html, (accessed on 10 December 2021)). To evaluate the overall contribution of all detected QTL, the likelihood-ratio-based R 2 between MSRI and all identified QTL in all 18 datasets were estimated using R package 'MuMln' V1.46.0 [32].

Favorable Alleles
To determine the number of favorable alleles of each RIL, the mean MSRI of individuals with the same alleles was calculated for each allele of the identified QTL. If a QTL spans more than two SNPs, the number of alleles present at a QTL may be greater than two. Theoretically, no recombination between SNPs within a QTL region is expected, and eventually, most QTL had only two alleles. The recombinant alleles had very low frequencies in the population. These recombinant alleles may be due to rare recombination events between SNPs or result from the errors of SNP imputation. Therefore, only two alleles with the highest frequencies were considered for each QTL. The allele with a high mean MSRI value was assigned a favorable allele, whereas another allele with a low mean MSRI value was assigned an unfavorable allele. For each of the RILs, the total number of favorable alleles for all QTL were counted.

Candidate Gene Prediction
To predict the candidate genes associated with marsh spot disease resistance, a list of gene families related to Mn transport, Mn efficiency or Mn content in plants was compiled (Table S6) [16] based on the common bean reference genome sequence [31]. The protein sequences of Mn-deficiency-related gene families from common bean, Arabidopsis, rice (Oryza sativa) and barley (Hordeum vulgare L.) (https://www.ncbi.nlm.nih.gov/guide/ proteins/, (accessed on 1 May 2021)) were extracted and BLAST (basic alignment search tool) [69] searches were performed against the common bean reference genome sequence to locate the bean orthologous sequences. A total of 154 candidate genes were eventually identified, belonging to ten gene families. A genome-wide scan was performed to identify the ones located within 100 Kb of QTL to constitute the list of candidate genes co-located with the identified QTL.

Conclusions
QTL mapping in this study determined that marsh spot resistance in cranberry common bean is a highly heritable trait that is genetically controlled by multiple genes. Although four gene loci with additive and epistatic effects have been detected through joint segregation analysis at a phenotype level as reported previously [19], sixteen additive and three pairs of epistatic QTL were further identified via QTL mapping at the genomic level in this study. These QTL explained up to 81% of phenotypic variation. Despite epistatic effects between some QTL, there existed a significant correlation between the number of favorable alleles of the additive QTLs and marsh spot resistance (R 2 = 72%), which confirmed that the favorable alleles of these QTL are additive and can be pyramided in future common bean cultivars by MAS. These QTL will facilitate the development of molecular markers for resistance breeding. In addition, 13 candidate genes related to Mn deficiency or Mn content in plants were shown to be co-located within six QTL regions. Those genes will be further validated in future functional genomic studies to determine their potential to improve marsh spot resistance in germplasm or new cultivars by adopting modern genetic improvement techniques such as genome or gene editing.