Population Genetic Structure and Diversity of Cryptic Species of the Plant Genus Macrocarpaea (Gentianaceae) from the Tropical Andes

The Pleistocene climatic oscillations (PCO) that provoked several cycles of glacial–interglacial periods are thought to have profoundly affected species distribution, richness and diversity around the world. While the effect of the PCO on population dynamics at temperate latitudes is well known, considerable questions remain about its impact on the biodiversity of neotropical mountains. Here, we use amplified fragment length polymorphism molecular markers (AFLPs) to investigate the phylogeography and genetic structure of 13 plant species belonging to the gentian genus Macrocarpaea (Gentianaceae) in the tropical Andes. These woody herbs, shrubs or small trees show complex and potentially reticulated relationships, including cryptic species. We show that populations of M. xerantifulva in the dry system of the Rio Marañón in northern Peru have lower levels of genetic diversity compared to other sampled species. We suggest that this is due to a recent demographic bottleneck resulting from the contraction of the montane wet forests into refugia because of the expansion of the dry system into the valley during the glacial cycles of the PCO. This may imply that the ecosystems of different valleys of the Andes might have responded differently to the PCO.


Introduction
Mountain ranges are often considered as speciation centers that significantly contribute to enhancing species diversity on Earth [1,2]. They are thought to favor speciation through their high overall physiographic heterogeneity that facilitates both allopatric divergence between populations occurring on different montane systems and adaptive divergence along elevational gradients within the montane systems [3,4]. Numerous studies have documented rapid species divergences (radiations) of taxa taking place almost immediately after the colonization of a particular montane region [5][6][7][8]. While the evidence for the contribution of adaptive divergence remains insignificant [9], it is clear that allopatric speciation plays a determinant role in driving many montane radiations [2,10].
Climate change has also been proposed as a potentially important driver of speciation in montane regions, particularly in non-vagile organisms such as plants [11,12]. It is relatively well justified that the Pleistocene climatic oscillations (PCO, 2.58-0.01 mya) have profoundly affected the demographic evolution of montane elements by repeatedly provoking species movement on the elevational gradients: descending during the glacial periods and ascending during interglacial periods [11]. At temperate latitudes, the glacial periods of the PCO are generally considered as a time of range expansion of montane cold-adapted taxa and their secondary contact in the lowlands [11,13,14]; however, examples of divergence or at least isolation in peripheral refugia valleys have also been documented [15,16]. In contrast, the effect of the PCO glacial-interglacial cycles on neotropical montane biodiversity is still understudied.
Among neotropical mountains, the tropical Andes are of particular interest because they have exceptional levels of diversity and endemism [17]. It has been proposed that the

Population Structure
The ∆K statistic of the STRUCTURE analyses performed on all the genotypes of all specimens sampled in the study supported an optimal value of K = 2 ( Figure S1A). Here, we denominate the clusters as A and B, with cluster A (black) being composed of populations of M. claireae and M. xerantifulva and cluster B (white) being composed of the populations of all remaining species (see Material and Methods). Levels of admixture between these two clusters were generally very low ( Figure 1I). Contrary to ∆K, estimates of mean loglikelihood (mean L(K)) increased steadily after K = 2 before reaching a plateau at K = 8. This discrepancy between the two statistics indicates that the ∆K statistic identified the highest level of hierarchical structure in our dataset but that additional layers of the genetic structure are likely present inside the clusters.
In the subsequent analyses performed inside cluster B, both ∆K and mean L(K) favored K = 4 ( Figure S1B). The bar plot shows the probability of assignment of individuals to a particular genetic cluster, with the individuals sorted according to increasing distance from the northernmost individual (north: left, south: right, Figures 2B and 1II). Two genetic groups are dominant. One, labeled as A1 and shown as sky blue in the north, is highly dominant in the populations of M. claireae 1 and M. claireae 2, while A3 (red) is found as dominant among the central and southern populations. In between, cluster A2 (yellow) is often found in central populations, showing mixed genetic information with A1 and A3. Its prevalence tends to decrease toward the south and it is absent from the two northernmost populations (A1). The populations with the highest prevalence of A2 are M. xerantifulva 19 and M. xerantifulva 2 but both are formed by only two individuals. Finally, the A4 (violet) cluster is relatively rare but is present along the entire geographic distribution. The only population where all individuals show this characteristic is M. xerantifulva 9. Overall, the level of admixture in individuals is low, except in Peruvian populations close to the border with Ecuador (i.e., "central populations"). Inside cluster B, the ∆K statistic favored K = 3 while the mean L(K) only reached a plateau for K = 6 ( Figure S1C). This indicates that several layers of the genetic structure might remain inside the identified clusters. All populations of M. dies-viridis fall in cluster B3 (blue). The cluster B1 (green) regroups  Table 1) are indicated below the bars and are formed by the first 4 characters o name.  Table 1) are indicated below the bars and are formed by the first 4 characters of the species name. populations close to the border with Ecuador (i.e., "central populations"). Inside cluster B, the ∆K statistic favored K = 3 while the mean L(K) only reached a plateau for K = 6 ( Figure S1C). This indicates that several layers of the genetic structure might remain inside the identified clusters. All populations of M. dies-viridis fall in cluster B3 (blue). The cluster B1 (green) regroups populations from eight different species and covers a very large geographic distribution (Figures 2A and 1III). The level of admixture in individuals is very low except for a few individuals from the three populations of M. quizhpei.  Table 1), the black to the sub-cluster B2 (M. pacifica, lenae, quizhpei) and the dark blue to the sub-cluster B3 (M. dies-viridis). Triangles indicate the populations of cluster A. The magenta triangles correspond to the sub-cluster A1 (M. claireae), the yellow to A2 and the red to A3 (M.  Table 1), the black to the sub-cluster B2 (M. pacifica, lenae, quizhpei) and the dark blue to the sub-cluster B3 (M. dies-viridis). Triangles indicate the populations of cluster A. The magenta triangles correspond to the sub-cluster A1 (M. claireae), the yellow to A2 and the red to A3 (M. xerantifulva). (B) Zoom on the Rio Chinchipe valley and the distribution of cluster A (M. claireae and xerantifulva), including the sub-cluster A4 (light blue and asterisk) not visible in (A).

Descriptive Statistics of Diversity
Gene diversity (H j , see Table 1) is almost two times higher in the populations of cluster B (mean H j = 0.150 ; sd = 0.025) than in those of cluster A (mean H j = 0.089; sd = 0.026) and the difference in means between the two clusters is highly significant (T (34 . 5) = 7.076, p < 0.001). The results of the hierarchical AMOVAs performed either by grouping populations by species or by their prevailing assignment with Bayesian clustering (Table 2) revealed that, in both cases, more than half of the total genetic variation (54.88% and 57.07%, respectively) can be explained by the variation between individuals inside the populations. In both designs, as expected based on the clusters being formed by several species, the proportion of the variation explained by differences among groups (species or cluster) is two times higher than the variation explained by the difference among populations within groups. This indicates that the groups' boundaries explain a substantial proportion of the genetic variation. However, the proportion of the variation explained by the species (33.06%) is higher than the proportion of the variation explained by STRUCTURE genetic clusters (27.25%), suggesting that the species; boundaries better capture the overall genetic variation. In both designs, all types of F statistics tested with permutations were highly significant ( Table 2). The fact that F SC statistics (variation among sub-groups divided by the sum of variation among sub-groups and variation within sub-groups) are significant indicates that populations inside the groups are still appreciably differentiated.
The NJ dendrogram computed from Nei's genetic distance ( Figure 3) displays the same basal dichotomy as inferred with STRUCTURE. Cluster A has relatively high bootstrap support (849/1000) and the population M. xerantifulva 9, which is the only population predominantly assigned to the sub-cluster A4, is well supported as being well differentiated within the cluster. Then, inside cluster A, populations do not group according to their STRUCTURE assignment. Similarly, internodes are short and bootstrap support is low, indicating potentially little divergence among populations. Inside cluster B, the bootstrap support is in general higher and the internodes are much longer than in cluster A, very likely indicating a larger divergence between populations inside cluster B. The only subcluster identified by STRUCTURE that is strongly supported as monophyletic is cluster B3. M. dies-viridis (893/1000), M. illuminata (845/1000) and M. quizhpei are strongly supported as monophyletic (920/1000). Table 2. Summary of the hierarchical analyses of molecular variance (AMOVA) performed in Arlequin by grouping: (1) the populations by species and (2) by genetic clusters identified by the two-step STRUCTURE analyses. p-values were obtained with 1023 permutations. F CT = variation among groups divided by total variation, F SC = variation among sub-groups within groups divided by the sum of variation among sub-groups within groups and variation within sub-groups, F ST = the sum of variation among groups and variation among sub-groups within groups, divided by total variation.

Spatial Genetic Structure
The correlograms showing the spatial autocorrelation of kinship (Fij), measured for cluster A ( Figure 4A) and cluster B ( Figure 4B) independently, are similar in the way they reveal significantly higher kinship than expected by chance across a relatively large distance. For cluster A, the kinship becomes null only between 6.5 and 33 km. For cluster B, the kinship becomes null between approximately 6 and 36 km. The slope of the regression of kinship against ln of distance is significantly more negative than expected under a scenario of the absence of spatial genetic structure both for cluster A (bF = −0.0147; p < 0.01) and cluster B (bF = −0.0148; p < 0.01). The two clusters have relatively similar Sp statistics with Sp = 0.0158 and Sp = 0.0172 for cluster A and cluster B, respectively. Finally, it should be noted that the intra-individual kinship (F(0)), which corresponds to the FIT when individuals from different populations are compared [46], is much higher for the cluster B (F(0) = 0.22) than for the cluster A (F(0) = 0.13). This indicates that the sum of inbreeding inside the populations and the overall differentiation between populations is higher in group B. As gene diversity is higher in the populations of cluster B (see above and Table 2), the

Spatial Genetic Structure
The correlograms showing the spatial autocorrelation of kinship (F ij ), measured for cluster A ( Figure 4A) and cluster B ( Figure 4B) independently, are similar in the way they reveal significantly higher kinship than expected by chance across a relatively large distance. For cluster A, the kinship becomes null only between 6.5 and 33 km. For cluster B, the kinship becomes null between approximately 6 and 36 km. The slope of the regression of kinship against ln of distance is significantly more negative than expected under a scenario of the absence of spatial genetic structure both for cluster A (b F = −0.0147; p < 0.01) and cluster B (b F = −0.0148; p < 0.01). The two clusters have relatively similar Sp statistics with Sp = 0.0158 and Sp = 0.0172 for cluster A and cluster B, respectively. Finally, it should be noted that the intra-individual kinship (F (0) ), which corresponds to the F IT when individuals from different populations are compared [46], is much higher for the cluster B (F (0) = 0.22) than for the cluster A (F (0) = 0.13). This indicates that the sum of inbreeding inside the populations and the overall differentiation between populations is higher in group B. As gene diversity is higher in the populations of cluster B (see above and Table 2), the difference is likely a consequence of higher differentiation between populations in cluster B rather than higher inbreeding.

Discussion
Here    Interestingly, one of the rare studies that addressed the phylogeography of a montane plant group from the same region identified the Rio Chinchipe valley as a source population of Ceroxylon echinulatum Galeano (Arecaceae) for the colonization of the eastern and western cordilleras of the Andes followed by subsequent migration toward the north during the Quaternary [47]. Here, instead, we propose that cluster A likely remained separated in this valley, and the low level of admixture detected between clusters A and B indicates that they have evolved in relative isolation (spatial and/or genetic) from one another. Naturally, this isolation theory needs to be supported by a closer inspection of the valleys surrounding the Rio Chinchipe valley, but it should be noted that, despite several visits on the eastern side of the Rio Chinchipe valley (populations were sampled on the western side of the river), we were not able to find populations from cluster A, while other Macrocarpaea species from different clades were collected at a higher elevation. Visits further south, on the other side of the Rio Marañón, were also unfruitful.

The Concept of Species Inside the Species Complex
Several elements from our results raise important questions regarding the definition of the species boundaries inside this Macrocarpaea species complex. Notably, neither the genetic distance-based dendrogram nor the Bayesian clustering analyses support the species boundary defined based on morphological data and field observations for M. claireae and M. xerantifulva. Macrocarpaea claireae in cluster A is relatively deeply nested in the middle of the populations of M. xerantifulva on the dendrogram. The Bayesian clustering analyses reveal that M. claireae 1 and the northernmost population of M. claireae 2, which were sampled a few kilometers apart from one another (<3 km), are members of the same genetic cluster (cluster A1). The fact that individuals belonging to cluster A1 are also found mixed with other genetic clusters in other populations close to the border between Ecuador and Peru indicates that M. claireae and M. xerantifulva may be the same species that displays a geographic genetic structure. The relatively low branch length at the internodes for clade A on the dendrogram also suggests relatively low genetic differentiation between populations inside the cluster, which support the hypothesis of the presence of a single species inside cluster A. However, as the samples of the populations of cluster A are very small, the structure detected within this cluster is probably not sufficiently strong from a statistical point of view and the species boundaries need to be tested with more solid approaches. Thus, future studies should revisit the taxonomic status of M. claireae and M. xerantifulva.
A population genetic approach like we employ in this study has great potential to first detect the genetic boundaries of cryptic species and then to assist the taxonomic assignment of individuals to one species or another. Surprisingly, the ∆K statistics [48] only favored 3 genetic groups inside cluster B where we recognized 10 taxonomic species [28,35]. We think that this small number of groups identified is very likely the outcome of the detection of a deep genetic structure inside cluster B, rather than the presence of a maximum of three differentiated genetic groups [49]. This is illustrated by the low level of admixture detected inside each sub-cluster ( Figure 1III). In comparison, the mean log-likelihood for the number of clusters (mean L(K); Figure S1C) only started to reach a plateau around K = 6 and thus indicates the potential presence of additional levels of genetic structure. The comparison of the partitioning of the genetic variance (AMOVA, Table 2), either assuming the species boundaries or the genetic clusters as the highest level of genetic structure, revealed that the species boundaries explained a greater proportion of the genetic variance, which also supports the idea that the genetic clusters identified by STRUCTURE did not capture the totality of the genetic structure present in our dataset.

Differences in Gene Diversity and the Dry Refugia
We found a significant deficit of about 40% in gene diversity in the populations of M. xerantifulva (mean H j = 0.089) in comparison to the populations from cluster B (mean H j = 0.150). Compared with estimates in the literature for other perennial and predominantly outcrossing plants [50], gene diversity is relatively low for cluster B and very low for M. xerantifulva. The very low levels of gene diversity found in M. xerantifulva indicate that this species very likely went through a demographic bottleneck, which fits the prediction of the dry refugia model [22]. This, in turn, suggests that the Rio Chinchipe valley might have been drier during the cold cycles of the PCO than nowadays and that the populations of M. xerantifulva re-colonized the valley only recently. An alternative hypothesis to explain the low genetic diversity in M. xerantifulva could be that it has recently been founded by a small number of individuals (founder event; [51]). This hypothesis is quickly refuted by the fact that the basal divergence between M. xerantifulva and cluster B likely dates back to their common ancestor some 2 Myr [45]. According to the prediction of the dry refugia hypothesis, it should be possible to locate the position of the refugium by identifying the population or the region that has the highest level of gene diversity. In M. xerantifulva there is no clear candidate population that differs from the others. The only exception is M. xerantifulva 12, but it is a very small population (N = 3, H j = 0.132) which limits its significance. Another approach to identifying refugia could consist of assessing the diversity of genetic clusters inside populations (genotype diversity). Using this approach, M. xerantifulva 18, which is in northern Peru in the proximity to the border with Ecuador, and M. xerantifulva 18, which is located about 20 km further south, are the best candidates. Considering that if an intensification of drought occurred in the Rio Chinchipe valley, it should have been more severe in the south, the position of refugia in that part of the valley is not illogical. Nevertheless, this pattern of genotype richness cannot be taken as the ultimate evidence for the position of refugia. Secondary contact zones can also generate a hotspot of genotype diversity (genetic cline) and both scenarios are relatively difficult to set apart [52].

Spatial Genetic Structure
Analyses of the spatial genetic structure (SGS) revealed a strikingly similar pattern of SGS between M. xerantifulva and genetic cluster B. Both groups show significantly high kinship over a relatively long distance (6-30 km) and thus have a relatively weak SGS. The regression slopes of the relationship between pairwise kinships and pairwise logarithms of distances (b F ) were significantly more negative than expected by chance, which indicates a significant effect of isolation by distance. However, this last assertion should be taken with caution as the maximal distances we considered were very large in comparison with other studies (124 km for M. xerantifulva and 847 km for cluster B). Therefore, it is not surprising to detect an effect of isolation by distance at such a large spatial scale. In addition, despite their similarity, the SGS estimates for cluster B are likely to be meaningless as this group is potentially composed of several species while SGS is usually estimated at the intraspecific level [53]. Consequently, we limit the interpretation of SGS to M. xerantifulva. The significantly high kinship we found at a relatively large distance in M. xerantifulva suggests that the populations of this species are relatively well connected genetically. Macrocarpaea species are usually predominantly pollinated by nectar-feeding bats [28,54] which have a foraging radius of 30 to 50 km and thus can disperse pollen across considerable distances [55]. It is therefore possible that the high population connectivity we detect is enhanced by bat pollination. Finally, the Sp statistic for M. xerantifulva (Sp = 0.0158) is extremely similar to the rare estimate that exists in the literature for bat-pollinated plant species [56,57].

Limitations and Alternative Explanations
We recognize that our study and the interpretation of the results have limitations and that there may be alternative explanations to our observations and conclusions. One of the limitations is the uneven sampling due to the extremely difficult accessibility and possibilities of finding of the populations and sample size. For example, some of the populations in cluster A, compared to cluster B, include only a few representatives, which might have biased the result showing low genetic variance, i.e., a low genetic difference observed among few samples may be low by chance and would be different if more individuals were sampled. However, the mean sampling size per population in cluster A is 8.5 samples/population and in cluster B it is 10.3. This difference is not substantial. Moreover, in the cases where only a few individuals were sampled, the populations were also small, i.e., all or the majority of the detected individuals at the given locality were sampled. Such small populations usually indicate that the genetic variability would be low due to the effects of limited gene flow and inbreeding [58,59]. In addition, some of the individuals were more clustered together at the sampling localities (e.g., M. claireae 2), while some were more distant from each other both horizontally and vertically (to a maximum of 500 m as mentioned in Materials and Methods), which is the case of, e.g., M. claireae 1 where the samples were found at altitudes in the range of 100 m. These discrepancies, sometimes caused by the biology of the respective species and sometimes by the conditions at the locality, might have also affected the observed genetic diversity within and between populations.
The results shown in the dendrogram in this study are different to those found in previous molecular studies on Macrocarpaea [43,45]. These discrepancies are probably due to the molecular markers used (AFLP versus ribosomal and plastid loci), a different sampling pattern (different number of individuals and different species included) and missing data in one of the species showing different positions in this study and in previous publications (e.g., M. claireae analyzed previously was missing plastid data which might have affected the analyses, as for other data both plastid and ribosomal markers were used, see Appendix S1 in [45]. In the main phylogeny ( Figure 2) in this publication [45], M. claireae was sister to M. quizhpei, while in Appendix S3, it was sister to M. xerantifulva. Conflicts between plastid, mitochondrial, ribosomal and nuclear data are commonly reported; also, AFLP data are not particularly convenient for phylogenetic reconstruction. The figure presented here is a dendrogram, not a phylogenetic tree; however, the results presented here support the sister relationship of M. claireae and M. xerantifulva, which also corresponds to their morphology and other biological traits. As previously mentioned, more extensive sampling and application of different methods are needed for more precise phylogenetic conclusions.

Sampling
Samples of 13 Macrocarpaea species were collected between 2011 and 2013 at 37 localities in nine regions of Ecuador and Peru ( Figure 2, Table 1). Eleven of the species are small cryptic trees (3-6 m), with minor differences in their morphology ( Figure 5; [28,35].  (Figure 2). This valley is connected in the south to the main Rio Marañón valley, which is characterized by the vegetation of seasonally dry tropical forests (SDTF). Since Macrocarpaea species generally occur in relatively small, sparse populations that are often difficult to locate, it was not possible to follow a predefined sampling design. Most species were sampled from one to a few and sometimes small populations. The exception is M. xerantifulva, that was sampled more intensively from 18 populations throughout its distribution range. Since no preliminary information was available about the population structure for any species in the genus, we arbitrarily decided that plants from patches separated by at least 500 m belonged to different populations. Some species are branched from the base; therefore, distant but random sampling ensured the collection of genetically different individuals. With its large, night-blooming, yellow-to-greenish-colored bell-shaped flowers that produce large quantities of nectar, Macrocarpaea is geared towards nocturnal bat pollination. These primary pollinators facilitate outcrossing across scattered populations. The nectar is also appreciated by other nocturnal visitors such as moths, as well as early diurnal visitors such as hummingbirds and butterflies that arrive before the corollas fall.

DNA Extraction and Genotyping
Total genomic DNA was extracted from silica gel-dried leaf samples with standard cetyltrimethylammonium bromide (CTAB)-chloroform extraction followed by isopropanol precipitation and ethanol washing [60]. The DNA samples were eluted in water and purified for a second time using the DNeasy plant mini kit (Qiagen Ltd., Crawley, UK), following the Mini protocol and starting at step 11 (i.e., after the lysis part of the protocol). . Fragment sizes were estimated with GeneMapper v4.0 (Applied Biosystems, Foster City, CA, USA) using the AFLP default peak detection parameters. Marker bins were set manually for each primer combination and only fragments within the range of 50-500 bp with a peak height of a minimum of 50 relative fluorescent units (rfu) were considered for generating a marker (bin). Peak scoring on the bin sets was performed automatically by the software and resulting tables with peak heights and sizes were exported and processed with scanAFLP v. 1.3 [63] using the default threshold options and following the steps (2) and (3) described in Hermann et al. [63]. This script assesses presence/absence within a particular locus, relying on locusspecific fluorescent signal intensity thresholds estimated from qualities of the fragment signal intensity distribution (0 for absence and 1 for presence of a band at a particular position). For each primer combination, we exported the binary data matrix, referred to as MatrixB in Hermann et al. [63], that we assembled in a single binary matrix containing a total of 522 loci. At this step, loci present in fewer than 10 individuals and more than all the individuals minus 10 were discarded. The matrix with 306 remaining loci was used to estimate the locus-specific error rate on duplicated samples by applying the mismatch approach of Bonin et al. [64]. Every locus with an error rate larger than 10% was removed from the binary matrix, resulting in a dataset of 203 loci for 345 individuals with a mean genotyping error rate of 4.1%.

Population Structure
The hierarchical structure in the AFLP dataset was investigated using the Bayesian genotype clustering method adapted to dominant markers, implemented in the program STRUCTURE v. 2.3.4 [65,66]. The program allows estimation of the most likely number of genetic clusters in a particular dataset by grouping individuals such that Hardy-Weinberg equilibrium is maximized within clusters. The optimal number of genetic clusters K was determined using the modal distribution of the ∆K statistic of Evanno et al. [48] with STRUCTURE HARVESTER v. 0.6.94 [67]. In datasets with complex structures, the Evanno approach tends to detect the uppermost level of hierarchical structure (i.e., the main divergent groups), but not necessarily the optimal number of populations. As our dataset is composed of potentially fully differentiated populations (species), after the first STRUCTURE analysis that suggested the existence of dichotomic clusters (K = 2), the dataset was subset according to the results of the individual assignments for K = 2 and analyses were repeated for each cluster independently. For the analysis of the complete dataset, K = 1-16 was used, while analyses on the subsets were performed for K = 1-12 and K = 1-8 based on the estimated number of populations. For each type of analysis and each K value, three replicate chains of 175,000 MCMC iterations were run and the first 25,000 iterations were discarded as a burn-in. STRUCTURE was used with the default settings except for the application of the recessive allele option and the admixture model with correlated allele frequencies. The average membership coefficients for the three simulation runs of a given K value were generated with CLUMPAK v. 1.1 [68].

Descriptive Statistics
Allele frequencies were estimated for each population from AFLP fragment frequencies with the Bayesian method of Zhivotovsky [69], implemented in the software AFLPsurv v. 1.0 [70], using a non-uniform prior distribution and assuming that populations were at the Hardy-Weinberg equilibrium. The resulting allele frequencies were used to estimate the expected heterozygosity or Nei's gene diversity (H j ) [71] for each population and to compute the genetic differentiation across all populations (F st ) using 1000 permutations. A thousand matrices of Nei's unbiased genetic distances were computed by bootstrapping over AFLP loci in AFLPsurv and exported to PHYLIP software package v. 3.6 [72] to build a neighbor-joining (NJ) dendrogram with bootstrap support on branches. To compare the portioning of the genetic variation between our a priori species hypothesis and the clusters identified at the issue of the two steps clustering analyses with STRUCTURE, two hierarchical analyses of molecular variance (AMOVA) were performed in Arlequin v. 3.5.1.3 [73]. Populations were first grouped by species, then by genetic clusters (see Table 1).

Spatial Genetic Structure
To estimate the spatial genetic structure (SGS) in the main genetic clusters identified by STRUCTURE, the pairwise kinship coefficient was calculated with the software SPAGeDi v1.5.a [46]. The estimator for the kinship coefficient for dominant genetic markers developed by Hardy [74] requires the user to specify an estimate of the inbreeding coefficient. Here, we assumed a null inbreeding coefficient. Then, kinship coefficients were associated with paired spatial distance in correlograms with 9 distance classes for the cluster composed of M. xerantifulva and M. claireae (0.1, 0.5, 2, 5, 10, 20, 50, 100, 200 km) and with 11 distances classes for the cluster composed by the other species (0.1, 0.5, 2, 5, 10, 20, 50, 100, 200, 400, 800 km). These relatively large distance classes, compared with what is found in the literature, were selected to accommodate our sparse sampling design. While it does not allow us to determine the fine SGS, it will already indicate how genetic structure is influenced by distance. The significance of the average kinship coefficient in every distance class was tested using 1000 permutations of spatial locations and individuals, while standard errors were estimated by Jackknifing across loci. The permutation procedure also allowed testing of whether the regression slope of the kinship coefficient over the distance was significantly negative, as expected under a scenario of isolation by distance. Both the kinship coefficient and the regression slopes are affected by the sampling design and thus are not appropriate for comparison between species. To compare the strength of SGS between genetic clusters, the Sp statistic [53] was computed as Sp = −b F /(1 − F (1) ), where b F is the slope of the regression of kinship against ln distance and F (1) is the average kinship of individuals in the first distance class (here 0-100 m). Because this statistic is calculated on the logarithm of the distance it is less affected by the sampling design and thus has the desirable advantage of being comparable across studies.

Conclusions
We showed that the species delimitation in this young Macrocarpaea species complex composed of several morphological cryptic species requires new consideration. However, the highly asymmetrical sampling in terms of the number of individuals and populations per species, as well as the relatively small AFLP dataset included in this study, does not allow unambiguous estimation regarding their genetic structure, phylogeny, evolu-tion history and prevailing evolutionary processes. More extensive sampling with even representation of all populations and application of modern genetic methods (such as next-generation sequencing) and corresponding analyses more suitable for population genetics, evolutionary reconstructions and related questions are required for more precise conclusions. This is especially valid for the genetically complex cluster B that regroups the essential of the supposed taxonomic diversity in this species complex.
Nevertheless, we were able to demonstrate that M. claireae and M. xerantifulva may form a single species that evolved in relative isolation from the rest of the species complex. Relying on denser sampling, we show that this species has relatively low gene diversity that could have resulted from a recent demographic bottleneck. We suggest that the intensification of drought induced by the PCO in the Rio Chinchipe valley might have compressed this species in refugia. While remaining highly speculative, this statement suggests that the different Andean valleys might have responded in a dramatically different way to the PCO. The easternmost valleys remain wet while the more internal ones, which are currently in contact with dry systems such as the Rio Marañón, suffer from drought. However, it would be highly desirable in the future, with the application of more modern methods and extensive sampling, to perform a reevaluation of the effect of the Pleistocene climatic oscillations compared to other potential evolutionary processes, as well as mountain geomorphology and climatic and potential other influential factors to confirm or correct our findings.

Simple Summary
The effect of the Pleistocene climatic oscillations on species evolution in the neotropical mountains is poorly known. Here, we used amplified fragment length polymorphism (AFLPs) genetic markers to analyze the phylogeography and population genetics of several lineages belonging to the plant genus Macrocarpaea (Gentianaceae) occurring in the tropical Andes. We test which climatic scenario was the most plausible during the Pleistocene glacial periods and what impact the abiotic changes had on montane forest populations.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/plants12081710/s1, Figure S1: Outputs of the STRUCTURE analyses. On the left side-plots of the ∆K statistic of Evanno et al. (2005) for detecting the number of K groups that best fit the data. On the right side-the plot of mean likelihood L(K) and variance per K value from STRUCTURE.  Data Availability Statement: Additional datasets generated and analyzed in this study that are not presented here or in the Supplementary Materials are available upon request from the corresponding author.