Whole-genome sequencing and phenotyping revealed structural variants and varied level of resistance against leaf crumple disease in diverse lines of snap bean (Phaseolus vulgaris)

1 Department of Plant Pathology, Coastal Plain Experimental Station, University of Georgia, Tifton, Georgia, United States of America
2 Department of Entomology, 1109 Experiment Station, Griffin, University of Georgia, Griffin, Georgia, United States of America
3 Department of Horticulture, Coastal Plain Experimental Station, University of Georgia, Tifton, Georgia, United States of America
U.S. Vegetable Laboratory, Agricultural Research Service, United States Department of Agriculture, Charleston, SC
* Corresponding author: bhabesh@uga.edu
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 25 October 2020 doi:10.20944/preprints202010.0501.v1


Introduction
Among the commonly grown Phaseolus species, Phaseolus vulgaris L. (common bean, snap bean) is an annual legume crop with a diploid genome of 521.1 Mb (2n = 22) [1]. Snap bean is one of the most important affordable food legumes for humans [2] and is consumed by over 80 million poor people in Latin America, the Caribbean, and Eastern and Southern Africa. In the U.S., snap bean is an important horticultural crop, especially in the state of Georgia, where it is grown on 9,979 acres and generates annual revenue of $24 million [3]. However, the production and quality of snap bean have been negatively impacted by two whitefly-transmitted begomoviruses, namely cucurbit leaf crumple virus (CuLCrV) and sida golden mosaic Florida virus (SiGMFV), which often appear as a mixed infection in Georgia.
CuLCrV is a bipartite begomovirus first identified in watermelon in the Imperial Valley of southern California in 1998 [4] and in snap beans in Georgia in 2009 [5]. In August 2018, snap beans with characteristic begomovirus infection symptoms (crumpled, curled, and thickened leaves) were found in Tifton, Georgia, and these plants were heavily infested with whiteflies. Subsequent analysis with degenerate and specific begomovirus primers revealed the presence of SiGMFV in infected plant tissues. In the Southeastern United States, SiGMFV (a bipartite virus) was first reported in Florida in 2006 on snap beans, with infected plants displaying leaf mottling, puckering, and severe curling symptoms [6]. Both viruses are transmitted by the sweet potato whitefly (Bemisia tabaci Gennadius). Currently, leaf crumple disease management is centered on vector control, usually with insecticides, but disease management via vector control is unreliable and insufficient. In contrast, host resistance is a more economical and sustainable approach that can potentially minimize field infection, yet there is a considerable lack of information on host resistance against these two begomoviruses in snap bean in the United States. It is therefore imperative to understand the genetics of host resistance, which involves identifying markers and genes for breeding highly resistant snap bean varieties.
Single nucleotide polymorphisms (SNPs) occur at high frequency throughout the genome and are considered the preferred genetic markers in breeding for disease resistance. SNPs, along with longer sequence variants such as insertions and deletions (InDels), have aided the discovery of quantitative trait loci (QTLs) and genes associated with disease resistance and agronomic traits in many cultivated crops. Prior genomic studies on P. vulgaris (dry beans) focused on agronomic and abiotic stress-related traits (drought stress, salt stress), but none focused on identifying resistance to viral pathogens. Biparental QTL mapping and genome-wide association studies (GWAS) have been used to discover such traits in common bean [7][8][9]. In the current study, besides evaluating the response of snap beans to natural infection by begomoviruses under field conditions for two consecutive years (seasons), we also identified sequence variants (SNPs and InDels) and their distribution in the Phaseolus cultivars using whole-genome sequencing (WGS).

Plant Materials
Eighty-four Phaseolus genotypes, comprising 82 snap bean and two Lima bean (P. lunatus) genotypes, were used in 2018. The two Lima bean genotypes, Jackson Wonder and Fordhook, are close relatives of snap beans. Eighty genotypes were tested in 2019, of which 76 were the same as those tested in 2018. Seeds of BMN-RMR-13, Bronco 2, Lakatte, SB4734, SB4735, SB4744, SB4679 and SV1137 were not available for evaluation in 2019. Four additional snap bean genotypes, Achiever, Blue Lake 274, Coyote and Greenback, were evaluated only in 2019 (Table 1). Seeds were obtained from commercial seed companies and the Germplasm Resources Information Network (GRIN) of the United States Department of Agriculture (USDA) (Table 1).

Experimental design, layout and environmental conditions
The genotypes mentioned above were evaluated for resistance to CuLCrV and SiGMFV under field conditions at the University of Georgia, Tifton. In both years (2018 and 2019), seeds were sown in 12 individual 138 m long raised beds. Each raised bed was divided into plots measuring 3.04 m × 0.91 m. Each plot comprised 20 plants at an in-row spacing of 7.62 cm, with double rows spaced 46 cm apart within each bed. Treatments (genotypes) were replicated (r = 3) in a randomized complete block design. Natural whitefly infestation was relied upon for virus transmission and the resultant disease. All cultural practices and disease management followed the UGA Cooperative Extension recommendations [10]. Insecticides were not sprayed, to ensure survival of whiteflies for disease incidence and spread. In 2018, the average maximum and minimum temperatures during the growing period were 34.5 °C and 21.1 °C, respectively, with an accumulated precipitation of 0.25 cm. In the 2019 growing period, the average maximum and minimum temperatures were 32.5 °C and 22.5 °C, respectively, with an accumulated precipitation of 0.23 cm.

Response of Phaseolus sp. (snap beans and Lima beans) genotypes to leaf crumple disease in the field
In 2018, genotypes were evaluated for virus resistance at 30 days after sowing (DAS). Because Hurricane Michael destroyed the crop in early October 2018, a second evaluation of resistance was not possible. In 2019, leaf crumple disease was evaluated twice, at 30 and 45 DAS. For each genotype, plants were evaluated visually for disease incidence and severity. Disease severity was evaluated in five randomly selected plants per plot per genotype using a severity scale of 0 to 100. A plant with no crumpling, mosaic or stunting was scored as 0 (Fig 1A). A plant with severe leaf crumpling, mosaic and stunting was scored as 100 (Fig 1B). Genotypes with disease severity ≤ 20% were rated as highly resistant, 21-50% as moderately resistant, 51-65% as susceptible and ≥65% as highly susceptible.
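The rating scheme above can be sketched as a small Python function. This is only an illustration, not the authors' scoring code: the function names and the example scores are hypothetical, and because the published ranges overlap at exactly 65%, the sketch assumes that a severity of 65% is rated as susceptible and anything above it as highly susceptible. Fractional means between class limits (for example 20.5%) are assigned to the next class up.

```python
from typing import Sequence


def mean_severity(scores: Sequence[float]) -> float:
    """Average the 0-100 severity scores of the plants rated in one plot."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


def classify_severity(severity_pct: float) -> str:
    """Map a disease severity (0-100 %) to a resistance class."""
    if not 0 <= severity_pct <= 100:
        raise ValueError("severity must be between 0 and 100")
    if severity_pct <= 20:
        return "highly resistant"
    if severity_pct <= 50:
        return "moderately resistant"
    if severity_pct <= 65:
        # Assumption: exactly 65 % is treated as susceptible, because the
        # published class limits (51-65 % and >=65 %) overlap at that value.
        return "susceptible"
    return "highly susceptible"


if __name__ == "__main__":
    # Hypothetical scores for the five plants rated in one plot.
    plot_scores = [10, 20, 30, 40, 50]
    print(classify_severity(mean_severity(plot_scores)))
```

Averaging before classifying, rather than classifying each plant and taking a majority, is a design choice of this sketch; the text does not state which was done.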

Whitefly count
Adult whiteflies were counted on each genotype at 30 DAS in 2018 and at 45 DAS in 2019. Counting was conducted on the lower side of leaves in the morning hours, when whiteflies are not very active. Adult whiteflies were enumerated on the top three fully expanded leaves by gently turning each leaf over by its tip. Whitefly counts were taken from 15 plants per genotype, five from each replicate. Whitefly count data for 2018 and 2019 were analyzed using linear mixed models in R version 3.4.2. Snap bean genotypes were considered fixed effects and replicates were considered random effects. To meet the assumptions of ANOVA (normality and homoscedasticity of variance), data were log(X+1) transformed prior to analysis. Post-hoc analyses were performed using the 'emmeans' package with the default Tukey's honest significant difference test (p = 0.05).
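The log(X+1) transformation mentioned above can be illustrated with a short Python sketch. The analysis itself was run in R; this example only shows the transformation and its back-transformation, assumes the natural logarithm (the base is not stated in the text), and uses hypothetical counts and function names.

```python
import math
from typing import List, Sequence


def log1p_transform(counts: Sequence[float]) -> List[float]:
    """Apply log(X + 1) so that zero counts remain defined."""
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return [math.log1p(c) for c in counts]


def back_transformed_mean(counts: Sequence[float]) -> float:
    """Mean on the log(X + 1) scale, expressed back on the count scale."""
    transformed = log1p_transform(counts)
    return math.expm1(sum(transformed) / len(transformed))


if __name__ == "__main__":
    # Hypothetical whitefly counts from five plants of one genotype.
    counts = [0, 3, 7, 15, 120]
    print("arithmetic mean:      ", sum(counts) / len(counts))
    print("back-transformed mean:", round(back_transformed_mean(counts), 2))
```

The back-transformed mean is pulled less by a single heavily infested plant than the arithmetic mean is, which is the practical reason for transforming skewed count data before an ANOVA-type analysis.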

DNA isolation, library preparation, sequencing and quality filtering of raw data
A total of 82 Phaseolus genotypes (80 snap beans and two Lima beans) were sequenced. Total DNA was isolated from a single plant of each genotype, collected arbitrarily from the field, using the DNeasy Plant Mini Kit (Qiagen) following the manufacturer's instructions. DNA at 50 ng/µl per sample was used for library preparation as per the standard protocol. Genomic DNA of each sample was randomly sheared into short fragments of about 300-500 bp. The resulting fragments were subjected to library construction using the NEBNext® DNA Library Prep Kit following the manufacturer's instructions. Libraries were subsequently sequenced, and the raw FASTQ reads obtained were quality filtered. We discarded a read pair when either read contained adapter contamination, when uncertain nucleotides (N) constituted more than 10 percent of either read, or when low-quality nucleotides (base quality Q ≤ 5) constituted more than 50 percent of either read.
The filtered sequencing data were aligned to the Phaseolus vulgaris reference genome (Pvulgaris_442_v2.0_softmasked) available at the Legume Information System (LIS). BWA (parameters: mem -t 4 -k 32 -M) was used for alignment, and the mapping rate and coverage were calculated from the alignment results. Duplicates were removed with SAMtools. Individual SNP variants were detected using GATK. SNPs and InDels were further filtered based on quality and depth: all variants with QUAL < 30, SOR > 3.0 or DP < 6, as well as heterozygous and multiallelic calls, were filtered out. Filtered SNPs and InDels were annotated using ANNOVAR [11]. SNP and InDel densities per kb were calculated in 100 kb bins across the 11 chromosomes of P. vulgaris.
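The three read-pair discard rules can be sketched in Python as follows. This is a minimal illustration rather than the filtering tool actually used, which the text does not name. The adapter sequence, the Phred+33 quality encoding and all function names are assumptions made for the example.

```python
from typing import List, Tuple

# Assumption: a commonly used Illumina adapter prefix; the adapter actually
# screened for is not given in the text.
ADAPTER = "AGATCGGAAGAGC"

Read = Tuple[str, str]  # (bases, quality string)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (assumed Phred+33) into base qualities."""
    return [ord(ch) - offset for ch in quality]


def read_fails(seq: str, qual: str, adapter: str = ADAPTER) -> bool:
    """Return True if a single read triggers any of the three discard rules."""
    length = len(seq)
    if length == 0:
        return True
    if adapter in seq.upper():                       # adapter contamination
        return True
    if seq.upper().count("N") / length > 0.10:       # more than 10 % N
        return True
    low_quality = sum(1 for q in phred_scores(qual) if q <= 5)
    return low_quality / length > 0.50               # more than 50 % with Q <= 5


def keep_pair(read1: Read, read2: Read) -> bool:
    """A pair is kept only if neither mate fails."""
    return not (read_fails(*read1) or read_fails(*read2))


if __name__ == "__main__":
    good = ("ACGT" * 10, "I" * 40)           # Q40 throughout
    n_rich = ("N" * 5 + "A" * 35, "I" * 40)  # 12.5 % N
    print(keep_pair(good, good))    # True
    print(keep_pair(good, n_rich))  # False
```

Because the rules apply to either read, one failing mate removes the whole pair, which keeps the two FASTQ files synchronized for paired-end alignment.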
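The variant filtering thresholds above can be expressed as a simple predicate. The sketch below is illustrative only: it works on a hypothetical in-memory record rather than on a real VCF file, it treats DP as a single depth value per call (the text does not say whether site-level or sample-level depth was used), and the class, function and chromosome names are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified view of one variant call for one sample."""
    chrom: str
    pos: int
    ref: str
    alts: Tuple[str, ...]      # alternate alleles listed at the site
    qual: float                # QUAL
    sor: float                 # strand odds ratio (SOR)
    dp: int                    # read depth (DP)
    genotype: Tuple[int, int]  # allele indices of a diploid call, e.g. (1, 1)


def passes_filters(v: Variant) -> bool:
    """Apply the thresholds given in the text; True means the call is kept."""
    if v.qual < 30 or v.sor > 3.0 or v.dp < 6:
        return False
    if len(v.alts) != 1:                  # multiallelic site
        return False
    if v.genotype[0] != v.genotype[1]:    # heterozygous call
        return False
    return True


def variant_type(v: Variant) -> str:
    """Label a biallelic variant as a SNP or an InDel by allele length."""
    return "SNP" if len(v.ref) == 1 and len(v.alts[0]) == 1 else "InDel"


if __name__ == "__main__":
    calls = [
        Variant("Chr01", 1200, "A", ("G",), 55.0, 1.2, 14, (1, 1)),
        Variant("Chr01", 3400, "C", ("T",), 25.0, 0.9, 20, (1, 1)),     # low QUAL
        Variant("Chr02", 880, "AT", ("A",), 80.0, 2.1, 9, (0, 1)),      # heterozygous
        Variant("Chr03", 4510, "G", ("GTT",), 61.0, 0.7, 11, (1, 1)),
    ]
    for call in calls:
        if passes_filters(call):
            print(call.chrom, call.pos, variant_type(call))
```

Removing heterozygous calls is a reasonable default for largely inbred lines, but it also discards genuinely heterozygous sites, so the retained set is a conservative lower bound on the variation present.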

Confirmation of begomoviruses (CuLCrV and SiGMFV) infection associated with leaf crumple symptoms in Phaseolus sp.
To confirm that the symptoms observed were associated with begomoviruses, we tested leaf samples from forty Phaseolus genotypes collected randomly from the field. Real-time PCR (qRT-PCR) was performed using the primers and protocol developed by Gautam et al. [12]. DNA samples that had previously tested positive for CuLCrV and SiGMFV were included as positive controls. Water was added in place of DNA in negative controls.

Response of Phaseolus sp. (snap beans and Lima beans) genotypes to leaf crumple disease in the field
In both years, typical symptoms of virus infection included yellow mosaic, leaf crumpling, and shortening, in varying degrees in different genotypes (Figure 1, Supplementary Figure S1). In 2018, each plant in the field was examined for visual symptoms, and 100% of the genotypes had at least one symptomatic plant per plot. In 2019, data from only five plants were recorded individually for disease incidence. One hundred percent of the plants visually screened for each genotype had leaf crumple incidence; however, disease severity varied considerably among genotypes (Figure 1, Supplementary Figure S1). None of the genotypes was symptomless or immune in either year. In 2018, out of 84 genotypes, 19 showed a high level of resistance to leaf crumple, 25 were moderately resistant, 11 were susceptible, and 29 were highly susceptible (Table 1).
In general, disease severities were higher in most genotypes in 2019 than in 2018, and many genotypes that were resistant in 2018 were susceptible in 2019. At 30 DAS, disease severity was higher in most genotypes than at 45 DAS. Sixteen snap bean genotypes classified as highly resistant in 2018 showed higher disease severities in 2019. In 2019, 24 snap bean genotypes were moderately resistant, 24 were susceptible and 31 were highly susceptible (Table 1).
The two Lima bean genotypes (P. lunatus), Jackson Wonder and Fordhook, had low disease severity in 2018 and 2019 (Table 1). Eight Phaseolus genotypes were moderately resistant in both 2018 and 2019, with disease severities ranging from 21% to 50%: Barron, Carson, Cedric Larson, Fordhook, Furano, Hastings white cornfield, Hmx 175724 and Wyatt.

Confirmation of begomoviruses (CuLCrV and/or SiGMFV) infection in Phaseolus sp.
In 2018 and 2019, both viruses were detected in the field and were widely prevalent. In 2018, out of the 38 samples tested, only one virus was detected in 11 samples and both viruses were detected in 27 samples. CuLCrV was detected in 37 samples, while SiGMFV was detected in 28 samples. In 2019, out of 40 samples tested, only one virus was detected in 21 samples, while both viruses were detected in 17 samples. CuLCrV was detected in 18 samples, while SiGMFV was detected in 36 samples.

Discussion
A total of 88 different Phaseolus genotypes were evaluated for natural resistance to CuLCrV and SiGMFV, with 84 evaluated in 2018 and 80 evaluated in 2019. Seventy-six genotypes were common to both years. Further, 82 genotypes were sequenced, and SNP and InDel variants were identified. Overall, the aim was to identify both phenotypic variability (symptom severity in response to CuLCrV and SiGMFV) and diversity within the genomes of these genotypes. All genotypes displayed begomovirus-associated symptoms in the field, suggesting that none of the genotypes was immune. Disease severity ranged from 5 to 100%, indicating a considerable difference in disease resistance among the genotypes. However, we observed some inconsistencies in the phenotypic response of genotypes between 2018 and 2019. For example, the genotypes Affirmed, Blush, Royal burgundy, Prevail and Tema showed a high-to-moderate level of resistance against leaf crumple disease in 2018 (severity: 5-23%), whereas in 2019 the symptom severity for these genotypes ranged from 46 to 55%. This could be due to a comparatively higher level of whitefly infestation in 2019 than in 2018, presumably resulting in more inoculation events with one or both begomoviruses. Moreover, the percentage of genotypes with mixed infection by both begomoviruses was higher in 2019 (38.5%) than in 2018 (27.5%), and previous observations indicate that mixed-infected plants can display more severe symptoms than plants infected with either viral pathogen alone [13]. It is therefore possible that more frequent mixed infection in 2019 contributed to the more severe symptoms observed in the same genotypes. Interestingly, the two Lima bean genotypes, Fordhook and Jackson Wonder, were resistant to highly resistant in both years.
Adult whitefly counts differed among the genotypes in both years, indicating a potential difference in whitefly preference for these genotypes or host-related factors that repel whiteflies, which needs to be investigated further. Carefully planned, extensive preference and biology experiments are required to fully comprehend the level of resistance of snap bean genotypes to whiteflies.
Next-generation sequencing (NGS) technology, particularly WGS with downstream computational analyses, has provided a quick and accurate method to discover genome-wide variation and identify marker-trait associations, as exemplified in several other studies [14][15][16][17][18][19]. We therefore generated WGS data for 82 Phaseolus genotypes and aligned them to the P. vulgaris reference genome [1]. The total number of sequenced reads varied widely (59.6 million to 146.1 million), with mapping rates ranging from 55.19% to 97.19%. The low mapping rate of Fordhook and Jackson Wonder is explained by the fact that these two genotypes belong to P. lunatus but were mapped to the P. vulgaris reference genome. In the nine P. vulgaris genotypes that also showed low mapping rates, the reason could lie in the breeding history of these cultivars, which might have resulted in allelic admixture events. With an average density of 125 SNPs, five insertions and seven deletions per 100 kb, variants were unevenly distributed throughout the genome. There were several 100 kb bins on each of the 11 chromosomes that did not contain any such variants. Despite uniform genome coverage of mapped reads, several empty bins were identified because of the stringent variant-calling parameters described in the methods section. Such markedly uneven distribution of DNA polymorphisms has also been reported in Arabidopsis and rice [20][21][22].
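The per-bin density calculation and the notion of an empty bin can be sketched as follows. This is an illustrative Python example under stated assumptions, not the pipeline used in the study: positions are taken to be 1-based, the last bin of a chromosome is allowed to be shorter than 100 kb, and the function names, chromosome length and positions are hypothetical.

```python
from typing import Dict, List, Sequence

BIN_SIZE = 100_000  # 100 kb bins, as in the text


def bin_counts(positions: Sequence[int], chrom_length: int,
               bin_size: int = BIN_SIZE) -> List[int]:
    """Count variants per fixed-width bin along one chromosome (1-based positions)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if not 1 <= pos <= chrom_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        counts[(pos - 1) // bin_size] += 1
    return counts


def summarize(counts: Sequence[int]) -> Dict[str, float]:
    """Mean variants per bin and the number of bins without any variant."""
    return {
        "mean_per_bin": sum(counts) / len(counts),
        "empty_bins": counts.count(0),
    }


if __name__ == "__main__":
    # Hypothetical 350 kb chromosome with four filtered variants.
    counts = bin_counts([1, 100_000, 100_001, 349_999], chrom_length=350_000)
    print(counts)             # [2, 1, 0, 1]
    print(summarize(counts))  # {'mean_per_bin': 1.0, 'empty_bins': 1}
```

Averaging over all bins, including empty ones, gives a genome-wide density; averaging only over occupied bins would give a higher figure, so the choice should be stated when densities are compared across studies.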
Varshney et al. [23] reported SNP and InDel densities (per 100 kb) of 63.3 SNPs and 38 InDels in cultivated chickpea, and 103.4 SNPs and 67.4 InDels in wild chickpea, using 412 cultivated and seven wild chickpea genotypes. We observed a higher SNP density but a lower InDel density in our 82 genotypes compared with this cool-season legume crop (chickpea). The variant density is expected to increase further if a larger set of genotypes is genotyped. This suggests that Phaseolus has more genetic diversity than its cool-season counterpart, which can be deployed in breeding for disease resistance. The SNP density identified in this study (125/100 kb) is also comparable to that of a warm-season legume, soybean (~100 SNPs/100 kb) [24]. In our study, the ratio of non-synonymous to synonymous SNPs was 0.66, which is lower than the ratios observed in pigeon pea (Cajanus cajan; 1.18) [25], soybean (Glycine max; 1.36) [24], rice (Oryza sativa; 1.18) [26], sorghum (Sorghum bicolor; 1.0) [27] and chickpea (Cicer arietinum; 1.20) [23]. The lower ratio in our study indicates that synonymous substitutions in the studied Phaseolus genotypes are tolerated, whereas non-synonymous substitutions are removed by purifying selection. It suggests that functionally constrained regions of genes evolve at a slower rate than regions that are not functionally constrained.
The transition/transversion (Ts/Tv) ratio is often used as a quality indicator of variation data produced from NGS experiments. A higher ratio is an indicator of good-quality SNPs, as sequencing errors and false-positive variants have a ratio closer to one [28]. We found that transitions (A/G and C/T) are the most common substitutions in the genome, which is consistent with other crop species such as foxtail millet (Setaria italica) [29], tea (Camellia sinensis) [30], soybean [31] and rice [22]. The Ts/Tv ratio of 1.71 that we observed is, however, lower than the ratios reported in crops such as rice [22], maize (Zea mays) [32] and tea [30]. The predominance of transitions could be because transitions more often result in synonymous mutations, whereas transversions more often bring about changes in protein structure and function.
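For readers unfamiliar with the statistic, the Ts/Tv ratio can be computed from a list of biallelic SNPs as in the Python sketch below. The function names and the example SNPs are hypothetical; the classification itself follows the standard definition, in which a transition exchanges two purines (A, G) or two pyrimidines (C, T) and a transversion exchanges a purine for a pyrimidine or vice versa.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T; False for any purine<->pyrimidine change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    return (ref in PURINES) == (alt in PURINES)


def ts_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Number of transitions divided by number of transversions."""
    transitions = transversions = 0
    for ref, alt in snps:
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


if __name__ == "__main__":
    # Hypothetical SNPs given as (reference allele, alternate allele).
    snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
    print(ts_tv_ratio(snps))  # 3 transitions / 2 transversions = 1.5
```

Of the twelve possible single-base substitutions, four are transitions and eight are transversions, so an observed ratio well above what substitution at random would give reflects a genuine bias toward transitions.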
Overall, we identified 20 genotypes (18 snap bean and two Lima bean) that consistently displayed a high-to-moderate level of resistance to begomoviruses under field conditions. Further characterization and confirmation of the resistance response should be conducted under controlled greenhouse conditions with standard parameters (exposure to a standard or equal number of viruliferous whiteflies). Earlier studies deployed genotyping by sequencing (GBS), which gives a reduced representation of the genome and captures fewer genomic variants [18], or used simple sequence repeats (SSRs), which occur much less frequently [33]. The WGS approach captured an unprecedented number of genomic variants that can be used to identify the genetic basis of disease tolerance against the begomoviruses in Phaseolus species.

Conclusions
Our study reports the occurrence of CuLCrV- and/or SiGMFV-induced symptoms in Phaseolus genotypes, including 80 snap bean and two Lima bean genotypes. Based on our field phenotyping experiments and genomics-assisted studies, we conclude that the tested genotypes display significant variation in susceptibility to one or both viruses. Future comprehensive studies will be carried out with a larger set of Phaseolus germplasm, which will aid in associating genetic diversity with the diverse disease responses to both begomoviruses.
Initiative Award 2018-51181-28420 from the USDA National Institute of Food and Agriculture. The University of Georgia is an equal opportunity provider and employer

Author Contributions
BD conceived the project. GA performed the bioinformatics analyses. SRK and AS performed the phenotyping and DNA isolation of plant samples. SRK and SG analyzed the field phenotyping and whitefly data. GA and SRK compiled the manuscript. BD, GA and RS contributed to planning and designing the experiment and to manuscript revision. BD, GA and RS designed and finalized the manuscript. BD planned the project, secured extramural funds, and revised and submitted the manuscript.

Legends to supplementary figures and tables
Supplementary Figure S1
Supplementary Figure S2. Overview of raw data generated and data retained after quality filtering of 82 lines of Phaseolus species for mapping and downstream analyses. Overall, >97% of data was retained after quality filtering of raw data.
Supplementary Figure S3. Read mapping statistics of filtered data of 82 lines of Phaseolus species on to the reference genome of Phaseolus vulgaris. Fourteen out of the 82 Phaseolus lines showed <75% mapping.
Supplementary Table S1. Summary of raw read data generated and amount of clean data obtained after filtering the raw data.