Allelic Diversity at Abiotic Stress Responsive Genes in Relationship to Ecological Drought Indices for Cultivated Tepary Bean, Phaseolus acutifolius A. Gray, and Its Wild Relatives

Some of the major impacts of climate change are expected in regions where drought stress is already an issue. Grain legumes are generally drought susceptible. However, tepary bean and its wild relatives within Phaseolus acutifolius or P. parvifolius are from arid areas between Mexico and the United States. Therefore, we hypothesize that these bean accessions have diversity signals indicative of adaptation to drought at key candidate genes such as: Asr2, Dreb2B, and ERECTA. By sequencing alleles of these genes and comparing to estimates of drought tolerance indices from climate data for the collection site of geo-referenced, tepary bean accessions, we determined the genotype x environmental association (GEA) of each gene. Diversity analysis found that cultivated and wild P. acutifolius were intermingled with var. tenuifolius and P. parvifolius, signifying that allele diversity was ample in the wild and cultivated clade over a broad sense (sensu lato) evaluation. Genes Dreb2B and ERECTA harbored signatures of directional selection, represented by six SNPs correlated with the environmental drought indices. This suggests that wild tepary bean is a reservoir of novel alleles at genes for drought tolerance, as expected for a species that originated in arid environments. Our study corroborated that candidate gene approach was effective for marker validation across a broad genetic base of wild tepary accessions.


Introduction
Identifying and characterizing novel sources of tolerance to abiotic stresses is among the most pressing requirements for coping with the effects of climate change on crop production [1]. Climate modeling forecasts that increased drought alone will jeopardize global crop production by over 10% sooner than 2050 [2], substantially worsening global malnutrition in the most vulnerable areas, which are also the poorest. Therefore, species and landraces locally adapted to dry environments [3] will prove keys to meet future demands. These exotic germplasm sources compared to those we use today, may either from the domestication bottlenecks in a cultivated genepool of tepary bean [48]. Since these regulators display signatures of adaptation in wild common bean from habitats with different water regimens, we hypothesized that tepary may exhibit diversity signals at these same genes indicative of adaptation to dry environments.
With the overall objective of understanding drought tolerance genes in tepary bean, our goals in this study were (1) to estimate drought tolerance in the hybridizing tepary bean clade s.l. (P. acutifolius-parvifolius) using geo-referenced germplasm accessions and associated climate information, and (2) to examine genetic correlations between estimated drought stress in tepary bean and its allele diversity at Asr2, Dreb2B, and ERECTA-encoding genes, which we had previously studied and found to be significantly associated with domestication and drought tolerance in common bean. This will allow the unlocking of drought-related genetic variation hidden in tepary beans and their wild relatives [11,49,50], extending from there into early landraces that together with variation from common bean wild relatives can be used to increase the rate of genetic gain for drought tolerance via inter-specific hybridization, marker-assisted backcrossing [51,52], genomic editing [53], or predictive breeding [54] in any of the Phaseolus cultigens.

Plant Material
A panel of diverse tepary bean genotypes representing the allelic variation in the P. acutifolius-parvifolius clade were considered in this study (23 wild P. acutifolius, 6 cultivated P. acutifolius, 4 P. acutifolius var. tenuifolius, and 19 P. parvifolius, Figure S1, Table S1). Wild accessions were prioritized over cultivated ones because many cultivars of tepary bean are duplicate or highly similar as indicated by Muñoz, Duque, Debouck, and Blair [16] and Blair, Pantoja, and Carmenza Munoz [9], using AFLP and SSR marker datasets, respectively. Besides, the hybridizing nature of the cultivated-wild tepary bean clade s.l. (P. acutifolius-parvifolius) implied that the effective size of this sampling was higher in terms of available standing adaptive variation and contrasted abiotic responses, allowing for a candidate gene design, as described below. Additional sequencing control genotypes were made up of the common bean accessions BAT93, BAT477, BAT881, and G19833. Seed material and recommendations were provided by D. Debouck and O. Toro of the Genetic Resource Unit and the Food and Agriculture Organization (FAO) Genebank collection, CIAT (http://isa.ciat.cgiar.org/urg/main.do, 10 April 2021).

Habitat-Based Drought Stress Indices
Available geo-referencing of the collection site for each accession was used to extract climate information at a 2.5-min resolution from WorldClim (http://www.worldclim.org, 10 April 2021) using the dismo and raster packages of the software package R v.4.0.2 (R Core Team). Historical temperature and precipitation values were obtained as monthly averages from 1970 to 2000 in order to estimate habitat Drought Index (DI) ratio (Table S2) using Thornthwaite's potential classic evapotranspiration (PET) model [55]. PET j computation followed Equation (1) for the 'j' month, as validated by Cortés, Monserrate, Ramírez-Villegas, Madriñán, and Blair [40], and López-Hernández and Cortés [56], in a diverse panel of common bean accessions. This PET j estimate considered explicit temperature effects (T j , monthly mean air temperature in • C), as well as indirect temperature-related properties via annual heat index (I, Equation (2)) and a cubic function of I (a, Equation (3)).
The PET j score in Equation (1) did not just account for temperature drivers, but also incorporated latitudinal adjusted sunlight radiation (L j , Equation (4)) as a function of day length duration (D j ) and latitude in sexagesimal degrees (Equation (5)).
Day length duration (D j ) was in turn corrected for day of year (J i , day 15 of the 'j' month), as in Equation (6)).
Finally, after computing PET j , monthly Drought Indices (DI j ) were obtained by comparing PET j estimators with monthly precipitation values (P j ), so that DI j followed Equation (7) for the 'j' month, where D j ≤ 100.
The PET j (Equation (1)) and DI j (Equation (7)) scores were considered over three, six, and 12 month intervals in the first place, then in an alternative analysis, where the estimations were carried out over four trimesters, with the aim to match tepary bean phenology at any time period over the year with drought stress indicators such as the ones described.
The timeframes considered accumulated (three, six, and 12 month periods) and nonaccumulated (trimesters one, two, three, and four) drought events over seasonal or yearlong time frames in the natural habitat where each accession was collected. A map of collection sites was drawn for the study area at 30 s resolution in R v.4.0.2 using ggmap package ( Figure 1). This geographical representation considered altitude in one panel and DI in the other showing each tepary bean collections site in parallel. The maps were useful to understand the overall DI, where red areas had higher stress values compared to those graphed in blue with lower drought stress and greater water availability, in comparison to altitude.

DNA Extraction, Candidate Gene Amplification and Sequencing
Total DNA was extracted from young leaves after seedling germination of each tepary bean genotype following the method of Dellaporta et al. [57]. Combinations of primers (Table S3) and amplification conditions of introns and flanking variable sections from three prioritized candidate genes followed standardizations carried out by Cortés, Chavarro, Madriñán, This, and Blair [35], Cortés, This, Chavarro, Madriñán, and Blair [41], and Blair, Cortés, and This [48] for Asr2, Dreb2B, and ERECTA-encoding genes, respectively. The same primers worked for amplicons' preparation and sequencing. Amplicons' quality and sizes were checked on a 1.5% agarose-Tris-Borate-EDTA gel containing GelRed (Biotium, Fremont, CA, USA). Successful amplicons with bright bands in the gels were purified using Exo-Sap clean-up reactions in order to be used as templates for subsequent paired-end Sanger sequencing reactions using BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, Waltham, MA, USA). Purified samples of the cleaned-up bands were aliquoted to a standard 200 ng of DNA, lyophilized, and sent to Macrogen USA (Psomagen Inc., Rockville, MD, USA) to be run on an ABI prism 3730 automated sequencer with original primers for sequencing and Big Dye chemistry (ABI, Foster City, CA, USA). Four common bean (P. vulgaris) genotypes (BAT93, BAT477, BAT881, and G19833) were included as sequencing controls to assist allele calling.

Patterns of Nucleotide Diversity and Environmental Correlations at Candidate Genes
Nucleotide alignments were carried out on Geneious v4.0 software (Biomatters Ltd., Auckland, New Zealand). The sequences were manually examined for quality of the alignment and single nucleotide polymorphism (SNP) call. Genetic diversity was explored per gene by computing PCA plots in DnaSP v5.10 [58], and Neighbor Joining (NJ) dendrograms with 1000 bootstrap iterations in Mega v4 [59]. The former software was also used to compute summary statistics of the site frequency spectrum. Pairwise F ST and median joining haplotype networks for each gene were calculated and drawn (F ST ranges were plotted per quartiles) in R packages poppr and pegas-ape, respectively (R Core Team).
Two strategic analyses were performed to identify the correlation between SNPs and environmental conditions of specific collection sites. Firstly, timeframes at three, six, and 12 months were considered. Secondly, timeframes included four trimesters along the year. The environmental indices were then correlated with all timeframes analyzed for the three genes. Environmental correlations with candidate genes implemented GLM, MLM, and CMLM models in Tassel v5.0 [60] for SNP markers, and MLM models for gene haplotypes in R's (v.3.4.4, R Core Team) package nlme [61]. SNP and haplotype-based mixed models, respectively, accounted for population strata [62] via an IBS kinship matrix as computed in Tassel v5.0 [60], as well as the first two principal genetic components consistent with previously generated SNP [23] and SSR [10] data. Since the candidate gene method had a priori bases on gene functionality and violated the random and independence allelic sampling hypotheses, a FDR equivalent p-value threshold of 0.01 was considered as in Oord and Sullivan [63].

Pervasive Environmental and Allelic Diversity for Three Drought Candidate Genes in Tepary
Bean s.l.
Drought stress was found to be extensive in tepary beans based on the climate data for the habitats where they were collected ( Figure 1). Wild tepary beans were from the foothills of the dry Sonoran desert mountains of northern Mexico with cultivars collected further afield as far south as Guatemala.
When observing the map below, we see that P acutifolius-parvifolius clade s.l. showed a correspondence to dry environments, with P. acutifolius and P. parvifolius prevalent in desert habitats at various altitudes. Distribution of the var. tenuifolius was in the northern area of the range, while wild var. acutifolius was from across the full range from Northern to Central Mexico. P. parvifolius was distributed in the middle part of the range with a few outliers from both extremes. The source of P. vulgaris were not illustrated here because they were not used in the climate analyses.
Following mapping and in order to study the variability of the drought stress indices based on taxonomic origin, we performed ANOVA and Tukey tests for the indices with normal behavior, and the pairwise Wilcox test for the non-normal indices corrected by Bonferroni ( Figure 2).  Following mapping and in order to study the variability of the drought stress indices based on taxonomic origin, we performed ANOVA and Tukey tests for the indices with normal behavior, and the pairwise Wilcox test for the non-normal indices corrected by Bonferroni ( Figure 2).
Scaled habitat drought stress indices were equivalent for different timeframes, as shown in Figure 2 with boxplots for the three (median = 0.14, CI 95%: 0.13-1.   (Table S1) as follows: Green and blue for cultivated and wild P. acutifolius, red for wild P. acutifolius var. tenuifolius, and purple for wild P. parvifolius. Altitudes represented in various tones of achromatic gray. Drought severity index based on Thornthwaite's potential evapotranspiration (PET) model (DI-Thornthwaite's index) indicated by scales of red (most intense), orange, yellow to green, and blue (least intense). Latitude and longitude represented by grids in both panels. Wild and cultivated accessions marked as filled triangles ( ) and squares ( ), respectively.  (Table S1) as follows: Green and blue for cultivated and wild P. acutifolius, red for wild P. acutifolius var. tenuifolius, and purple for wild P. parvifolius. Altitudes represented in various tones of achromatic gray. Drought severity index based on Thornthwaite's potential evapotranspiration (PET) model (DI-Thornthwaite's index) indicated by scales of red (most intense), orange, yellow to green, and blue (least intense). Latitude and longitude represented by grids in both panels. Wild and cultivated accessions marked as filled triangles (▴) , respectively.
Following mapping and in order to study the variability of the drought stress indices based on taxonomic origin, we performed ANOVA and Tukey tests for the indices with normal behavior, and the pairwise Wilcox test for the non-normal indices corrected by Bonferroni ( Figure 2).
Scaled habitat drought stress indices were equivalent for different timeframes, as shown in Figure 2 with boxplots for the three (median = 0.14, CI 95%: 0.13-1.  acutifolius and wild P. acutifolius, red for wild P. acutifolius var. tenuifolius, and purple for wild P. parvifolius.
Scaled habitat drought stress indices were equivalent for different timeframes, as shown in Figure 2 with boxplots for the three (median = 0.14, CI 95%: 0.13-1.

Allele Variants at Dreb2B and ERECTA-Encoding Candidate Genes Were Correlated with Drought Stress
Single nucleotide polymorphism alleles and frequencies along with expected heterozygosity (H e ) are found in Table 1, with 13 SNPs in Asr2, 8 in Dreb2B, and 22 in ERECTA. The difference in SNP number for each gene was probably related to the length of the fragment analyzed by sequencing or the nature of polymorphism in the genes. Grouping of accessions by PCA clustering was minimal based on overall genetic polymorphism at Asr2 (Figure 3A), Dreb2B ( Figure 3B), and ERECTA-encoding ( Figure 3C), respectively. Reconstruction of Neighbor Joining (NJ) dendrograms supported this observation ( Figures S2-S4, respectively for the same genes). Genotypes of different taxonomic origins for drought tolerance were generally intermingled (Table S4). More concretely, cultivated P. acutifolius was not necessarily more clustered than wild P. acutifolius (i.e., Asr2, Figure 3A), and these were not assembled separately from P. acutifolius var. tenuifolius and P. parvifolius. morphism at Asr2 (Figure 3A), Dreb2B ( Figure 3B), and ERECTA-encoding ( Figure 3C), respectively. Reconstruction of Neighbor Joining (NJ) dendrograms supported this observation ( Figures S2-S4, respectively for the same genes). Genotypes of different taxonomic origins for drought tolerance were generally intermingled (Table S4). More concretely, cultivated P. acutifolius was not necessarily more clustered than wild P. acutifolius (i.e., Asr2, Figure  3A), and these were not assembled separately from P. acutifolius var. tenuifolius and P. parvifolius.  (Table S1): Green and blue for cultivated and wild P. acutifolius, red for P. acutifolius var. tenuifolius, and purple for P. parvifolius, latter two wild entries. First two principal genetic components allow comparisons with previously generated SNP [23] and SSR [10] data. Summary statistics calculated for the site frequency spectrum for each candidate gene in tepary bean s.l. revealed contrasting demographic/selection patterns ( Table 2). In this analysis, Asr2 matched the expectations of a semi-structured pairwise mismatch distribution (positive Tajima's D value of 0.873). Meanwhile, for Dreb2B and ERECTA-encoding genes, signatures of directional/purifying selection (negative Tajima's D values of −0.814 and −0.974, correspondingly) were observed, likely in favor of adaptive alleles selectively advantageous. Table 2. Summary statistics of the site frequency spectrum at Asr2, Dreb2B, and ERECTA-encoding candidate genes for drought tolerance in tepary bean s.l. (P. acutifolius-parvifolius clade). Depicted summary statistics include S: Number of polymorphic (segregating) sites (enforced maf > 0.05 for the entire dataset), maf: Average minimum allele frequency, He: average expected heterozygosity-as a measure of the polymorphism information content (PIC), π: Nucleotide diversity [64], θW: Theta of Watterson-per site from S [65], and Tajima's D [66]. Only variable taxa with enough sampling to make per-population computations reliable are kept.    (Table S1): Green and blue for cultivated and wild P. acutifolius, red for P. acutifolius var. tenuifolius, and purple for P. parvifolius, latter two wild entries. First two principal genetic components allow comparisons with previously generated SNP [23] and SSR [10] data.

Gene
Summary statistics calculated for the site frequency spectrum for each candidate gene in tepary bean s.l. revealed contrasting demographic/selection patterns ( Table 2). In this analysis, Asr2 matched the expectations of a semi-structured pairwise mismatch distribution (positive Tajima's D value of 0.873). Meanwhile, for Dreb2B and ERECTAencoding genes, signatures of directional/purifying selection (negative Tajima's D values of −0.814 and −0.974, correspondingly) were observed, likely in favor of adaptive alleles selectively advantageous. Table 2. Summary statistics of the site frequency spectrum at Asr2, Dreb2B, and ERECTA-encoding candidate genes for drought tolerance in tepary bean s.l. (P. acutifolius-parvifolius clade). Depicted summary statistics include S: Number of polymorphic (segregating) sites (enforced maf > 0.05 for the entire dataset), maf : Average minimum allele frequency, H e : average expected heterozygosity-as a measure of the polymorphism information content (PIC), π: Nucleotide diversity [64], θ W : Theta of Watterson-per site from S [65], and Tajima's D [66]. Only variable taxa with enough sampling to make per-population computations reliable are kept. In a different analysis, boxplots of the pairwise F ST values were prepared (Figure 4) for each candidate gene across the taxonomic origins to search for significant differences between Asr2, Dreb2B, and ERECTA. Differences were not observed for the first and last of these genes, but a cultivated vs. wild P. acutifolius difference was observed for Dreb2B.

Gene
In a different analysis, boxplots of the pairwise FST values were prepared (Figure 4) for each candidate gene across the taxonomic origins to search for significant differences between Asr2, Dreb2B, and ERECTA. Differences were not observed for the first and last of these genes, but a cultivated vs. wild P. acutifolius difference was observed for Dreb2B. Haplotype network reconstructions were made for Asr2 ( Figure 5A), Dreb2B ( Figure  5B), and ERECTA-encoding ( Figure 5C) candidate genes. For the latter, all cultivated tepary beans shared a single haplotype, while the wild tepary beans from within P. acutifolius and from P. parvifolius were in various nodes of the networks.  acutifolius-parvifolius clade). Nodes represent haplotypes, its size relative to its frequency. Marks above each segment are substitutions. Nodes are colored by taxonomy (Table S1): Green and blue for cultivated and wild P. acutifolius, red for P. acutifolius var. tenuifolius, and purple for P. parvifolius, the latter two wild accessions. Haplotype network reconstructions were made for Asr2 ( Figure 5A), Dreb2B ( Figure 5B), and ERECTA-encoding ( Figure 5C) candidate genes. For the latter, all cultivated tepary beans shared a single haplotype, while the wild tepary beans from within P. acutifolius and from P. parvifolius were in various nodes of the networks. In a different analysis, boxplots of the pairwise FST values were prepared (Figure 4) for each candidate gene across the taxonomic origins to search for significant differences between Asr2, Dreb2B, and ERECTA. Differences were not observed for the first and last of these genes, but a cultivated vs. wild P. acutifolius difference was observed for Dreb2B. Haplotype network reconstructions were made for Asr2 ( Figure 5A), Dreb2B ( Figure  5B), and ERECTA-encoding ( Figure 5C) candidate genes. For the latter, all cultivated tepary beans shared a single haplotype, while the wild tepary beans from within P. acutifolius and from P. parvifolius were in various nodes of the networks.  acutifolius-parvifolius clade). Nodes represent haplotypes, its size relative to its frequency. Marks above each segment are substitutions. Nodes are colored by taxonomy (Table S1): Green and blue for cultivated and wild P. acutifolius, red for P. acutifolius var. tenuifolius, and purple for P. parvifolius, the latter two wild accessions. , and (C) ERECTA-encoding candidate genes for drought tolerance in tepary bean s.l. (P. acutifolius-parvifolius clade). Nodes represent haplotypes, its size relative to its frequency. Marks above each segment are substitutions. Nodes are colored by taxonomy (Table S1): Green and blue for cultivated and wild P. acutifolius, red for P. acutifolius var. tenuifolius, and purple for P. parvifolius, the latter two wild accessions.
In the test of habitat drought stress correlation with alleles at the three genes, results were consistent across several model types, yet CMLM was stricter than GLM and MLM (Table 3). For instance, GLM showed most of the correlations at Dreb2B, being 87.5% of the segregating sites correlated under this model for both timeframes. At ERECTA, the correlated SNPs accounted for 27%. MLM, on the other hand, was less prevalent at any timeframe, and CMLM showed no significance to any SNP, likely due to an overcorrection or inflated type β error. Significant associations were lost when carrying out haplotype-based environmental correlations. Table 3. Habitat drought stress significant correlations with SNP polymorphisms at Asr2, Dreb2B, and ERECTA-encoding candidate genes for drought tolerance in tepary bean s.l. (P. acutifolius-parvifolius clade). Habitat drought stress (Table S2) was computed as a drought index (DI) using Thornthwaite's evapotranspiration model at three, six, and 12 months as well as trimester timeframes (T1 to T4). Environmental correlations with each candidate gene were determined using generalized, mixed, and compressed mixed (GLM, MLM, and CMLM) linear models accounting for population stratification via an IBS kinship matrix as implemented in Tassel v5.0. Only significant associations are depicted; their p-value estimates are bolded (p-value < 0.05) and underlined (p-value ≤ 0.01), the latter a FDR equivalent for a candidate gene design [63]. SNPs in coding regions are in bold, and non-synonymous underlined. Tepary bean is unique compared to common bean in being monophyletic, having a single genepool, and furthermore, low introgression from other species due to its highly autogamous inbreeding status. This is reflected in the results of the candidate gene analysis here for wild and cultivated tepary accessions. Genes Asr2, Dreb2B, and ERECTA harbored signatures of directional/purifying selection, in favor of adaptive alleles represented by various SNPs significantly correlated with the environmental drought indices (p-value < 0.05 and 0.01). Network analysis also found haplotypes more frequent for wild accessions than cultivars. These results suggest that tepary bean s.l., especially wild species P. parvifolius and wild accessions of P. acutifolius var. acutifolius or var. tenuifolius, were reservoirs of novel alleles at candidate genes for tolerance, as expected for a drought-tolerant species that originated in arid environments. Overall diversity is thought to increase consecutively from low levels in cultivated P. acutifolius, to intermediate levels in wild P. acutifolius var. acutifolius or var. tenuifolius, to higher levels in P. parvifolius. For this reason, sampling of larger amounts of wild accessions was important and representative of the diversity found in previous characterizations [9,16].

SNP GLM p-Value
Allele diversity in Asr2, Dreb2B, and ERECTA-encoding candidate genes was recovered in tepary bean s.l. (including all of those in the primary and secondary genepool, namely the variants which are all inter-crossable, and the wild species P. parvifolius, which can also be crossed with no barrier to P. acutifolius). This is appealing because variation of cultivated tepary bean is limited by strong genetic bottlenecks, given the paucity of diversity in seed color or SSR marker genotypes in cultivated accessions [9].
The observation above of diversity signals indicative of adaptation to drought at these candidate genes brings us back to the hypothesis for this research and questions of whether wild tepary beans and P. acutifolius var. tenuifolius and P. parvifolius are more variable for drought gene alleles than cultivated types. Although we cannot answer the question definitively, it appears that drought tolerance candidate genes were not as severely affected by domestication as suggested by the limited allele diversity found for SSR markers or few seed colors in the crop [48]. The main reason for this might be that drought tolerance was likely an "untouched" trait during the domestication of tepary bean, being a drought-tolerant species complex from very arid environments. Increases in functional diversity-but not in expression diversity [67], had already been noticed at target traits during the domestication of common bean as revealed by genome-wide enrichment of non-synonymous substitutions [68], and may also be responsible for adaptive functions, beyond drought tolerance, in tepary bean.
Analyses of the same candidate genes for drought tolerance reveal contrasting patterns in the closely related common bean species. Signatures of directional and divergent selection are observed in Asr1 [35] and Dreb2A [40] for genes in wild accessions from semimesic to dry habitats, and an ERECTA-encoding gene exhibits haplotype correlations with ecological differences in diverse common beans [48]. Additionally, when these candidate genes were screened in cultivated genepools and races of common bean, diversity signals indicative of adaptation to drought are in line with expectations concerning selection during the domestication syndrome or simply reflect demographic bottlenecks.

Unlocking Useful Genetic Variation for Drought Tolerance
Since genetic variation at drought candidate genes is partly recovered for tepary bean, this polymorphism likely predates the domestication event. Still, wild tepary bean, especially from P. acutifolius var. acutifolius and var. tenuifolius with no genetic barriers to the cultivated types, might be useful to add new variability for cultivars. P. parvifolius has also been shown to be not too distant from tepary beans [9] and crossability should be tested with multiple accessions. The domestication of tepary bean in the xeric region of the Sonoran Desert, along mountain arroyos and up to the Mexico-USA border could have reinforced allele variation at drought genes in favor of tolerant phenotypes. This appears to hold true for both var. tenuifolius and var. acutifolius in the wild accessions. Inter-crossing would be useful with a panel of diverse, multi-colored tepary beans with multiple seed sizes since most cultivars are white seeded [17], but some cream, yellow and tan to brown colored cultivated accessions exist [16]. Among our wild accessions, most had the typical wild tepary bean patterned seed, which are mottled, small in size, angular in shape, and gray to black in seed color.
Drought tolerance has traditionally not been a trait under direct selection as part of the domestication syndromes of Phaseolus bean species [12]. Seed germination, color and size, or pod size and dehiscence plus flowering time [69], Rhizobia symbiosis [70,71], secondary metabolites [72], and circadian clock components [73] are all functions more commonly regulated during the domestication process of legumes. Still, the nature of the strength and the direction of the selection have varied across domestications in Phaseolus. This is the case for common bean with two or three domestication events [74], and lima bean also with multiple possible domestications [75].
However, since tepary bean is a drought-tolerant species that originated in droughtprone environments only once, drought tolerance may have preceded the domestication syndrome, and could have been a trait unaffected by domestication despite strong population bottlenecks of a single domestication of light colored seeded types [9]. Unlocking newer genetic variation for drought tolerance in tepary bean can harness further resequencing of allelic diversity and marker validation in a broader basis of germplasm material, wild accessions, and related species, within the limitations of the few accessions collected in gene banks.
Returning to the allele diversity found among Phaseolus, abiotic stress responsive candidate genes exhibit comparable patterns of diversity when contrasting orthologous sequences across species. For instance, the common bean's Asr2 matches the bimodal expectations of a pairwise folded site frequency spectrum [35] inferred using neutral reference loci [76], while Dreb2B shows signatures of purifying selection [40]. This contrasts with the double domestication of common bean [77]. Dreb2B and ERECTA-encoding orthologous genes display analogous selective signatures in tepary bean s.l., as supported by Tajima's D summary statistic, despite P. vulgaris and P. acutifolius-parvifolius not being sister clades [12,78]. Parallel studies might be done in P. lunatus, an outlier among the Phaseolus or in related species as Mesoamerican P. vulgaris, P. dumosus (year-long), and P. coccineus (scarlet runner bean).
Furthermore, full sequencing of linked and well-understood genic regions, e.g., [79], as compared to "random" discovery of SNP markers in linkage equilibrium [80,81], allows for a more precise application of analytical tools, targeting adaptation in wild [82] and semidomesticated materials [83]. Techniques such as gene-based species tree reconstruction [84] and inferences of the mutation/selection balance [85,86] presuppose physical linkage among markers. As perspective for oncoming studies, we recommend leveraging [87] tools capable of discerning among genuine signatures of adaptive selection to drought from those due to spurious effects [88] related to the demography of the domestication bottlenecks [69,89,90]. Phylogeographic inferences, e.g., [91] will also be improved by assuming independent gene mutation models [92], yet unlinked marker inferences are already available [93][94][95]. Finally, harnessing polygenic adaptive scores from gene-based and genome-wide prediction models [96][97][98] will help building a more cohesive picture of natural drought adaptation and vulnerability [99] in the face of climate change [100], with a focus on highly heterogeneous mountain geographies [101].
As the candidate gene approach bypasses the theoretical limitations of using biallelic genetic markers, it is still an efficient and cost-effective methodological alternative, e.g., [102], especially at advanced stages in the genetic mapping of key traits or for marker validation; and in "orphan" species short either on funding or genetic knowledge base. The three genes we analyzed showed significant SNPs (p-value < 0.05 and 0.01, the latter a FDR equivalent [62] for a candidate gene [63] design) given habitat drought stress DI (at three, six, and 12 months) in one, five, and four segregating sites for Asr2, Dreb2B, and ERECTA-encoding genes, respectively considering semester timeframes as well, supporting the hypothesis that the genes are key for drought tolerance in tepary beans. The utility of genotype x environment adaptation variables based on geo-references collection sites for the study of candidate genes in wild and cultivated accessions is demonstrated by our study.
For well-studied traits with relatively conserved pathways, such as drought tolerance, the candidate gene approach is still informative despite de novo high-throughput genomewide genotyping [103][104][105]. While reduced genome complexity genotyping techniques, like genotyping-by-sequencing [106], perform well in medium-size panels, the candidate gene approach allows deeper genotyping of fewer gene regions [107,108] for traits influenced by variants with intermediate effects, e.g., [109,110].

Conclusions
In summary, our candidate gene study allowed us to delve into the signatures of directional/purifying selection, in favor of adaptive alleles, or the frequency of haplotypes among taxonomic groups or correlated with the environmental drought indices. The results suggested that tepary bean s.l., especially wild accessions, could be sources of novel alleles for breeding of further drought tolerance in the cultivated accessions of an already drought-tolerant orphan crop. Cultivated tepary bean, being low in diversity [9], would benefit from wide crosses with wild relatives to obtain new traits such as new leaf shapes and new seed colors while maintaining the drought tolerance that it innately possesses.